Background Severe COVID-19 is characterised by fever, cough, and dyspnoea. Symptoms affecting other organ systems have been reported. However, it is the clinical associations of different patterns of symptoms that influence diagnostic and therapeutic decision-making. In this study, we applied simple machine learning techniques to a large prospective cohort of hospitalised patients with COVID-19 to identify clinically meaningful subgroups.

Methods We obtained structured clinical data on 59 011 patients in the UK (the ISARIC Coronavirus Clinical Characterisation Consortium, 4C) and used a principled, unsupervised clustering approach to partition the first 25 477 cases according to symptoms reported at recruitment. We validated our findings in a second group of 33 534 cases recruited to ISARIC-4C, and in 4 445 cases recruited to a separate study of community cases.

Findings Unsupervised clustering identified distinct subgroups. The first was a core symptom set of fever, cough, and dyspnoea, which co-occurred with additional symptoms in three further patterns: fatigue and confusion, diarrhoea and vomiting, or productive cough. Presentations with a single reported symptom of dyspnoea or confusion were also identified, alongside a subgroup of patients reporting few or no symptoms. Patients presenting with gastrointestinal symptoms were more commonly female, had a longer duration of symptoms before presentation, and had lower 30-day mortality. Patients presenting with confusion, with or without core symptoms, were older and had a higher unadjusted mortality. Symptom clusters were highly consistent in replication analysis within the ISARIC-4C study. Similar patterns were externally verified in 4 445 patients from a study of self-reported symptoms of mild disease.

Interpretation The large scale of the ISARIC-4C study enabled robust, granular discovery and replication of patient clusters.
Clinical interpretation is necessary to determine which of these observations have practical utility. We propose that four patterns are usefully distinct from the core symptom group: gastrointestinal disease, productive cough, confusion, and paucisymptomatic presentations. Importantly, each is associated with in-hospital mortality that differs from that of patients with core symptoms. These observations deepen our understanding of COVID-19 and will influence clinical diagnosis, risk prediction, and future mechanistic and clinical studies.
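
As an illustration of the kind of unsupervised partitioning described in the Methods, the following is a minimal sketch only. The abstract does not name the algorithm or distance measure used, so the choice of k-medoids with Jaccard distance is an assumption made here for illustration, as are the function names, the toy patient records, and the number of clusters. It is not the study's pipeline.

```python
# Illustrative sketch: k-medoids over symptom sets with Jaccard distance.
# ASSUMPTION: the algorithm, distance, data, and k are hypothetical choices,
# not taken from the study.

def jaccard_distance(a: frozenset, b: frozenset) -> float:
    """Return 1 - |a & b| / |a | b|; two empty symptom sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def assign(dist, medoids):
    """Label each patient with the index of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
        for i in range(len(dist))
    ]


def k_medoids(patients, k, max_iter=100):
    """Partition symptom sets into k clusters; returns (labels, medoid indices)."""
    n = len(patients)
    dist = [[jaccard_distance(patients[i], patients[j]) for j in range(n)]
            for i in range(n)]

    # Deterministic initialisation: start from the most central patient,
    # then repeatedly add the patient farthest from all chosen medoids.
    medoids = [min(range(n), key=lambda i: sum(dist[i]))]
    while len(medoids) < k:
        medoids.append(
            max(range(n), key=lambda i: min(dist[i][m] for m in medoids))
        )

    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                # Keep the old medoid if a cluster ends up empty.
                new_medoids.append(medoids[c])
                continue
            # The new medoid minimises total distance to its cluster members.
            new_medoids.append(
                min(members, key=lambda i: sum(dist[i][j] for j in members))
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids

    return assign(dist, medoids), medoids


# Hypothetical toy records: each patient is the set of symptoms reported.
toy_patients = [
    frozenset({"fever", "cough", "dyspnoea"}),
    frozenset({"fever", "cough", "dyspnoea", "fatigue"}),
    frozenset({"fever", "cough"}),
    frozenset({"diarrhoea", "vomiting"}),
    frozenset({"diarrhoea", "vomiting", "abdominal pain"}),
    frozenset({"confusion"}),
]

labels, medoids = k_medoids(toy_patients, k=3)
print(labels, [sorted(toy_patients[m]) for m in medoids])
```

In this toy example the respiratory, gastrointestinal, and confusion-only records fall into separate clusters. On real data the number of clusters would need to be chosen and checked for stability, for instance by replication in a held-out group as the study describes.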