Cleansing of UK Address Postcodes in SQL Server 2008

Create a function to cleanse and re-format UK Postcodes in SQL Server

Context

This article has been quite popular, so I have updated it with some more information and cleaned up the formatting, as a couple of spaces were missed out, so it can be copied straight into SSMS now.

It uses the following steps;

  • Remove everything from the text string (including spaces) apart from numbers and letters
  • Replaces the standard typing mistakes like;
  • Portsmouth is PO, Colchester is CO, but sometimes gets entered as P0 or C0 (a zero)
  • Ilford and Reading are IG and RG, but sometime get entered with a 6 in place of the G
  • Work out format the postcode should be in, and insert a space where it should be.
  • Check that the format is correct and return a new value if true, else return the old format.

Please note, this does not check whether the post code actually exists, and further reading can be found on the following link Postcodes in the United Kingdom.

SQL

CREATE FUNCTION CleansePostCode (@PostCode VARCHAR(100))
 RETURNS VARCHAR(100)
 AS BEGIN

 DECLARE @OldPostCode VARCHAR(100)
 SET @OldPostCode=LTRIM(RTRIM(@PostCode))
  
 --Clean to Numbers and Letters
 DECLARE @Letter INT
   SET @Letter PATINDEX('%[^0-Z]%',@PostCode)
   BEGIN
   WHILE @Letter>0
   BEGIN
   SET @PostCode STUFF(@PostCode,@Letter,1,'')
   SET @Letter PATINDEX('%[^0-Z]%',@PostCode)
   END
 END
  
--Replace obvious errors
SET @PostCode =(CASE WHEN LEFT(@PostCode,2)='P0' THEN STUFF(@PostCode,1,2,'PO')
 WHEN LEFT(@PostCode,2)='C0' THEN STUFF(@PostCode,1,2,'CO')
 WHEN LEFT(@PostCode,2)='I6' THEN STUFF(@PostCode,1,2,'IG')
 WHEN LEFT(@PostCode,2)='HO' THEN STUFF(@PostCode,1,2,'HD')
 WHEN LEFT(@PostCode,2)='C8' THEN STUFF(@PostCode,1,2,'CB')
 WHEN LEFT(@PostCode,2)='D0' THEN STUFF(@PostCode,1,2,'DO')
 WHEN LEFT(@PostCode,2)='H5' THEN STUFF(@PostCode,1,2,'HS')
 WHEN LEFT(@PostCode,2)='0L' THEN STUFF(@PostCode,1,2,'OL')
 WHEN LEFT(@PostCode,2)='0X ' THEN STUFF(@PostCode,1,2,'OX')
 WHEN LEFT(@PostCode,2)='P0' THEN STUFF(@PostCode,1,2,'PO')
 WHEN LEFT(@PostCode,2)='R6' THEN STUFF(@PostCode,1,2,'RG')
 ELSE @PostCode END)
  
 --Works out correct pattern and insert space
 SET @PostCode = (CASE WHEN @PostCode LIKE '[A-Z][A-Z][0-9][0-9][A-Z][A-Z]' THEN LEFT(@PostCode,3)+' '+RIGHT(@PostCode,3)
 WHEN @PostCode LIKE '[A-Z][0-9][0-9][A-Z][A-Z]' THEN LEFT(@PostCode,2)+' '+RIGHT(@PostCode,3)
 WHEN @PostCode LIKE '[A-Z][0-9][0-9][0-9][A-Z][A-Z]' THEN LEFT(@PostCode,3)+' '+RIGHT(@PostCode,3)
 WHEN @PostCode LIKE '[A-Z][0-9][A-Z][0-9][A-Z][A-Z]' THEN LEFT(@PostCode,3)+' '+RIGHT(@PostCode,3)
 WHEN @PostCode LIKE '[A-Z][A-Z][0-9][0-9][0-9][A-Z][A-Z]' THEN LEFT(@PostCode,4)+' '+RIGHT(@PostCode,3)
 WHEN @PostCode LIKE '[A-Z][A-Z][0-9][A-Z][0-9][A-Z][A-Z]' THEN LEFT(@PostCode,4)+' '+RIGHT(@PostCode,3)
 ELSE @PostCode END)
  
 --Update @NewPostCode to old value if pattern does not match, or else use new value.
 DECLARE @NewPostCode VARCHAR(100)
 IF (CASE WHEN @PostCode LIKE '[A-Z][A-Z][0-9] [0-9][A-Z][A-Z]' THEN 1
 WHEN @PostCode LIKE '[A-Z][0-9] [0-9][A-Z][A-Z]' THEN 2
 WHEN @PostCode LIKE '[A-Z][0-9][0-9] [0-9][A-Z][A-Z]' THEN 3
 WHEN @PostCode LIKE '[A-Z][0-9][A-Z] [0-9][A-Z][A-Z]'THEN 4
 WHEN @PostCode LIKE '[A-Z][A-Z][0-9][0-9] [0-9][A-Z][A-Z]' THEN 5
 WHEN @PostCode LIKE '[A-Z][A-Z][0-9][A-Z] [0-9][A-Z][A-Z]' THEN 6
 ELSE 0 END)=0
 BEGIN
   SET @NewPostCode=@OldPostCode
 END
 ELSE
   SET @NewPostCode=@PostCode
 RETURN @NewPostCode
 END
Ousia Logo