Frederick Sanger completed the amino acid sequence of bovine insulin in 1955, a landmark achievement in biochemistry. He showed that insulin consists of two polypeptide chains, an A chain of 21 amino acids and a B chain of 30 amino acids, joined by two inter-chain disulfide bonds, with a third disulfide bond inside the A chain. Insulin was the first protein to be fully sequenced, which demonstrated that a protein has a specific, defined amino acid sequence.
Sanger's sequencing of insulin took more than a decade and fundamentally changed our understanding of proteins. At the time, it was not universally accepted that proteins had a defined chemical structure. His approach was methodical and innovative. He first separated the A and B chains by breaking the disulfide bonds that link them, using oxidation with performic acid. He then used 1-fluoro-2,4-dinitrobenzene (now known as Sanger's reagent) to label the N-terminal amino acid of each polypeptide chain. Hydrolyzing the labeled chain and identifying the labeled amino acid revealed the residue at the start of the sequence. To sequence the rest of each chain, he used partial hydrolysis with acids and enzymes to break it into smaller, overlapping peptide fragments. He separated these fragments by chromatography and electrophoresis and determined the sequence of each small piece. Because the fragments overlapped, the residues shared between two fragments showed how they joined, so he could piece them together like a jigsaw puzzle to deduce the full sequence of both the A and B chains. Finally, he determined the positions of the three disulfide bonds (two inter-chain, one intra-chain on the A chain). This work earned him his first Nobel Prize in Chemistry in 1958 and showed conclusively that a protein has a unique, defined amino acid sequence, the premise behind the later view that sequence determines a protein's three-dimensional structure and, consequently, its biological function.
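The jigsaw-style overlap step described above can be illustrated with a short Python sketch. It is a simplified, modern analogy and not a reconstruction of Sanger's manual procedure: the function names `overlap` and `assemble` are hypothetical, the greedy merge strategy is an assumption chosen for brevity, and the five fragments are invented for illustration (picked so that they tile the bovine A chain in one-letter code); they are not the peptides Sanger actually isolated.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def assemble(fragments: list[str], min_overlap: int = 2) -> str:
    """Greedily merge the two pieces with the longest overlap until one remains."""
    # Drop duplicates and fragments wholly contained in another fragment.
    pieces = sorted(
        f for f in set(fragments)
        if not any(f != g and f in g for g in fragments)
    )
    while len(pieces) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(pieces):
            for j, b in enumerate(pieces):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k < min_overlap:
            raise ValueError("fragments do not overlap enough to assemble")
        # Join the pair, writing the shared residues only once.
        merged = pieces[best_i] + pieces[best_j][best_k:]
        pieces = [p for n, p in enumerate(pieces) if n not in (best_i, best_j)]
        pieces.append(merged)
    return pieces[0]


# Hypothetical overlapping fragments, given in arbitrary order.
fragments = ["LYQLENY", "GIVEQCC", "ENYCN", "SVCSLYQ", "QCCASVC"]
print(assemble(fragments))  # GIVEQCCASVCSLYQLENYCN
```

Greedy merging works here because no pair of adjacent residues repeats anywhere in this 21-residue chain, so any overlap of two or more residues is genuine. Sequences with repeated stretches can mislead a greedy merge, which is one reason the real work needed many independent fragments that confirmed one another.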