Login or Sign Up to become a member!
LessThanDot Sit Logo

LessThanDot

Data Management

Less Than Dot is a community of passionate IT professionals and enthusiasts dedicated to sharing technical knowledge, experience, and assistance. Inside you will find reference materials, interesting technical discussions, and expert tips and commentary. Once you register for an account you will have immediate access to the forums and all past articles and commentaries.

LTD Social Sitings

Lessthandot twitter Lessthandot Linkedin Lessthandot friendfeed Lessthandot facebook Lessthandot rss

Note: Watch for social icons on posts by your favorite authors to follow their postings on these and other social sites.

Your profile

Search

XML Feeds

Google Ads

« SQL Server 2008 Proximity Search With The Geography Data TypeKaspersky Web Site Hacked With SQL Injection, How Embarrassing Is This? »
comments
Rate Post:
submit to reddit Digg!FacebookDotnetkicks

Have you ever wanted to add functionality that allows you to perform simple radius searches based on Zip Code? For example, Show me all the cities within 20 miles of a certain zip code or show me the 5 closest retail locations. You can easily add this functionality to your database by importing a zip code database and writing a little code. This article will show you where to get zip code data, how to import it to your database, and how to use the data to perform proximity searches.

Denis and I decided to show you how you can do simple radius searches based on Zip Codes, he did the SQL 2008 version here: http://blogs.lessthandot.com/index.php/DataMgmt/DataDesign/sql-server-2008-proximity-search-with-th (using the geography data type) and I did the SQL Server 2000/2005 version.

The first thing we need to do is load our data. There are various sources for this data, ranging from free to expensive monthly subscriptions. One source of free data is: http://download.geonames.org/export/zip/

Once you have downloaded your data, the next step is to import it in to your database. You can use the following script to do it.

  1. IF Exists(SELECT *
  2.           FROM   Information_Schema.Tables
  3.           WHERE  Table_Name = 'ZipCodes'
  4.                  And Table_Type = 'Base Table')
  5.   DROP TABLE ZipCodes
  6.  
  7. CREATE TABLE ZipCodes(
  8.     [Country] [VARCHAR](2) NULL,
  9.     [ZipCode] [VARCHAR](5) NULL,
  10.     [City] [VARCHAR](200) NULL,
  11.     [STATE] [VARCHAR](50) NULL,
  12.     [StateAbbreviation] [VARCHAR](2) NULL,
  13.     [County] [VARCHAR](50) NULL,
  14.     [Unused1] [VARCHAR](5) NULL,
  15.     [Unused2] [VARCHAR](1) NULL,
  16.     [Latitude] [DECIMAL](8,5) NULL,
  17.     [Longitude] [DECIMAL](8,5) NULL,
  18.     [Unused3] [VARCHAR](1) NULL
  19.     )
  20.  
  21.  
  22. DECLARE @bulk_cmd VARCHAR(1000)
  23. SET @bulk_cmd = 'BULK INSERT ZipCodes
  24.    FROM ''C:\YourFolder\US.txt''
  25.    WITH (FIELDTERMINATOR=''\t'', ROWTERMINATOR = '''+CHAR(10)+''')'
  26.  
  27. EXEC(@bulk_cmd)
  28.  
  29. ALTER TABLE ZipCodes DROP COLUMN Unused1, Unused2, Unused3
  30.  
  31. CREATE CLUSTERED INDEX IX_ZipCodes_Zip ON dbo.ZipCodes(ZipCode, Longitude, Latitude)
  32. GO
  33. CREATE NONCLUSTERED INDEX IX_ZipCodes_Longitude_Latitude ON dbo.ZipCodes(Longitude,Latitude, ZipCode)
  34.  
  35. GO
  36.  
  37. SELECT * FROM ZipCodes

Next, we’ll need to add a couple of functions that will allow us to use this data.

The original function that calculates distances has been replaced with this one. There was a flaw in the original version. Based on rounding errors, the calculations fed in to the arc-cosine (acos) function could be greater than 1 or less than -1 which would cause a domain error. To accommodate this flaw, I perform the calculations in multiple steps so that I can catch the values that result in the domain error.

Credit for finding the flaw goes to Chris. Thank you.

  1. CREATE FUNCTION [dbo].[CalculateDistance]
  2.     (@Longitude1 DECIMAL(8,5),
  3.     @Latitude1   DECIMAL(8,5),
  4.     @Longitude2  DECIMAL(8,5),
  5.     @Latitude2   DECIMAL(8,5))
  6. RETURNS FLOAT
  7. AS
  8. BEGIN
  9. DECLARE @Temp FLOAT
  10.  
  11. SET @Temp = SIN(@Latitude1/57.2957795130823) * SIN(@Latitude2/57.2957795130823) + COS(@Latitude1/57.2957795130823) * COS(@Latitude2/57.2957795130823) * COS(@Longitude2/57.2957795130823 - @Longitude1/57.2957795130823)
  12.  
  13. IF @Temp > 1
  14.     SET @Temp = 1
  15. ELSE IF @Temp < -1
  16.     SET @Temp = -1
  17.  
  18. RETURN (3958.75586574 * ACOS(@Temp) )
  19.  
  20. END
  1. CREATE FUNCTION [dbo].[LatitudePlusDistance](@StartLatitude FLOAT, @Distance FLOAT) RETURNS FLOAT
  2. AS
  3. BEGIN
  4.     RETURN (SELECT @StartLatitude + SQRT(@Distance * @Distance / 4766.8999155991))
  5. END
  1. CREATE FUNCTION [dbo].[LongitudePlusDistance]
  2.     (@StartLongitude FLOAT,
  3.     @StartLatitude FLOAT,
  4.     @Distance FLOAT)
  5. RETURNS FLOAT
  6. AS
  7. BEGIN
  8.     RETURN (SELECT @StartLongitude + SQRT(@Distance * @Distance / (4784.39411916406 * COS(2 * @StartLatitude / 114.591559026165) * COS(2 * @StartLatitude / 114.591559026165))))
  9. END

Finally, we can begin writing some interesting queries. For example, return a list of zipcodes within 20 miles of zipcode 20013 (Washington, DC).

  1. -- Declare some variables that we will need.
  2. DECLARE @Longitude DECIMAL(8,5),
  3.         @Latitude DECIMAL(8,5),
  4.         @MinLongitude DECIMAL(8,5),
  5.         @MaxLongitude DECIMAL(8,5),
  6.         @MinLatitude DECIMAL(8,5),
  7.         @MaxLatitude DECIMAL(8,5)
  8.  
  9. -- Get the lat/long for the given zip
  10. SELECT @Longitude = Longitude,
  11.        @Latitude = Latitude
  12. FROM   ZipCodes
  13. WHERE  ZipCode = '20013'
  14.  
  15. -- Calculate the Max Lat/Long
  16. SELECT @MaxLongitude = dbo.LongitudePlusDistance(@Longitude, @Latitude, 20),
  17.        @MaxLatitude = dbo.LatitudePlusDistance(@Latitude, 20)
  18.  
  19. -- Calculate the min lat/long
  20. SELECT @MinLatitude = 2 * @Latitude - @MaxLatitude,
  21.        @MinLongitude = 2 * @Longitude - @MaxLongitude
  22.  
  23. -- The query to return all zips within a certain distance
  24. SELECT ZipCode
  25. FROM   ZipCodes
  26. WHERE  Longitude Between @MinLongitude And @MaxLongitude
  27.        And Latitude Between @MinLatitude And @MaxLatitude
  28.        And dbo.CalculateDistance(@Longitude, @Latitude, Longitude, Latitude) <= 20

You could also use this for a "store locator". On a website, potential customers could enter their zipcode and you could present a list of top 5 closest stores.
The query shown below assumes you have a Stores table with a ZipCode column.

  1. DECLARE @Longitude DECIMAL(8,5)
  2. DECLARE @Latitude DECIMAL(8,5)
  3.  
  4. SELECT  @Longitude = Longitude,
  5.         @Latitude = Latitude
  6. FROM    ZipCodes
  7. WHERE   ZipCode = '20013'
  8.  
  9. SELECT  TOP 5 Stores.StoreName, ZipCodes.City,  dbo.CalculateDistance(@Longitude, @Latitude, ZipCodes.Longitude, ZipCodes.Latitude) AS Distance
  10. FROM    @Stores AS Stores
  11.         INNER Join ZipCodes
  12.             ON Stores.ZipCode = ZipCodes.ZipCode
  13. ORDER BY dbo.CalculateDistance(@Longitude, @Latitude, ZipCodes.Longitude, ZipCodes.Latitude)

As you can see, these queries perform very well, and are relatively easy to write.

About the Author

George has been developing software professionally for 19 years, first for the department of defense, and then for various other companies. In 1998, George started his software company, Orbit Software, specializing in School Bus Transportation software. His specialty is refining SQL Server queries to deliver optimal performance.
Social SitingsTwitterLTD RSS Feed
17356 views
submit to reddit Digg!FacebookDotnetkicks

Comments and Feedback

34 comments

Comment from: Alex Ullrich [Member] Email
*****
I haven't used your new function (I will have to do another test) but in SQL 2005 you can also use the CLR functions very effectively for calculations like this:

http://www.alexcuse.com/BlogEntry.aspx?PostID=55

I'm taking a better look at Denis' 2008 method right now, I want to test all three later. I'd like to take a look at the new function against the other one (which I stole from you as well) and see how they stack up too.
02/12/09 @ 19:39
Comment from: SQLDenis [Member] Email
*****
Alex maybe you should create a post showing how to do it with the CLR
02/13/09 @ 04:34
Comment from: Chris [Visitor] Email
*****
Thanks for sharing this. I'm trying it out now.

I went looking for information on how to calculate distances using longitude and latitude using kilometers and came across this page: http://www.meridianworlddata.com/Distance-Calculation.asp

On there they list the following values that could be substituted for the first value in your CalculateDistance function:
3437.74677 (nautical miles)
6378.7 (kilometers)
3963.0 (statute miles)

You used a value of 3958.75586574 - how did you arrive at your value?

I might suggest adding another input to the CalculateDistance function to allow for results to be generated in different units of measure (miles vs. kilometers). I've done the same here but thought others could benefit from the same.

Thanks again.
02/13/09 @ 08:35
Comment from: Alex Ullrich [Member] Email
Denis, its' in the link. I guess I could post it here too but that would not be very DRY of me ;)

Actually it could be helpful for others to have all three in one place (then we could do a wrap-up post testing them all). I'll revisit the CLR version when I get a chance, or if I'm lazy just take the relevant bits out of that post.

Expect to see it next week. I'll stop rambling now!
02/13/09 @ 09:44
Comment from: Chris [Visitor] Email
*****
I worked with the code today and have a few more thoughts:

I downloaded the source zip code data using the link you provided. You should know that there are a small number of instances when a single zipcode has multiple rows in the table. I believe this happens when a zipcode spans multiple cities. For my uses I’m picking the first record from each zipcode so as not to duplicate my results.

I did run into some errors:
Trying to calculate the distance between ZipCode 11779 and itself throws the following error in SQL Server 2000:
“A domain error occurred.”
This doesn’t happen for all zipcodes, though. Most seem to produce a very small distance difference and round out to 0. Handling matching zipcodes up-front would take care of this. I’ve also seen cases where two zip codes have the same coordinates (e.g., 72035 and 72033) and this can cause the same error.
02/13/09 @ 16:17
Comment from: George Mastros [Member] Email
Chris,

It well known that some zip codes span multiple cities/towns. It's also well known that some cities have multiple zip codes.

I figured out what causes the domain error. The CalculateDistance function uses the arc-cosine (acos) function. It's well documented in books on line that acos returns a domain error if you supply a value less than -1 or greater than +1. Identification is half the battle, right. I will post back again with a solution to this problem.

Thank you very much for bringing this to my attention.
02/13/09 @ 16:35
Comment from: George Mastros [Member] Email
Chris,

I modified the function to accommodate this problem. I edited the blog to replace the flawed version of the function.

Thanks again for pointing out the problem.
02/13/09 @ 16:49
Comment from: Dale R. Whitaker [Visitor] Email
*****
I like this and admire what you shared with others.
02/15/09 @ 06:09
Comment from: Doug [Visitor]
****-
Since @Distance is squared, how does one ask for a logitude 5 miles less than the one specified?
It seems like the distance calculations are not valid fro negative numbers, true?
03/31/09 @ 12:30
Comment from: JayZ [Visitor]
Very nice work, you saved me a lot of time.

Thanks you!
05/05/09 @ 08:57
Comment from: Henryc [Visitor]
*****
Hello all,
Nice work. We are working on an application where
2 zip codes are entered in the same query and we want to have the option of varying the mileage radii around each zip code entered and we need to return a list of all the zip code values found in each radius set.
For example, enter 30301 for origin and 50008 for destination and 20 mile radius around origin and 25 mile radius around the destination. Each record set stored in the database will contain origin and destiation zip codes. The table of zip code values will be compared to zip code table and any records where the origin and destination zip codes in the stored table that are included in the 2 sets will return these records, others will be excluded. Can you give any suggestions of the best way to construct this?
Thanks
05/14/09 @ 21:40
Comment from: rod marsh [Visitor]
*****
Great work. Thank you very much. Just incorporated this into a complicated search query and it worked a treat. A++
05/19/09 @ 03:39
Comment from: Smoker [Visitor]
*****
Can Someone tell me what's the exact formula to calculate maximum latitude
and longitude
06/30/09 @ 23:34
Comment from: George Mastros [Member] Email
@Smoker,

Your question confuses me. Normally, Longitude has a range of -180 to +180 and Latitude has a range -90 to 90.

Since these are constants, no formula is needed.

I suspect your question is more complicated than this. As such, I encourage you to post a question on our forum, and I will be glad to help you out. Please post your question here:

http://forum.lessthandot.com/viewforum.php?f=17
07/01/09 @ 05:37
Comment from: Smoker [Visitor]
@George

Thanks for quick reply. I understand the range values but I want to know how
you have calculated the values like 4784.39411916406 , 114.591559026165 ,
4766.8999155991 in the LatitudePlusDistance and LongitudePlusDistance.
07/01/09 @ 08:56
Comment from: Maria [Visitor]
*****
Hi George,

I was looking for the same kind of proximity search code base and found it here.
Great Work. But In Functions LatitudePlusDistance and LongitudePlusDistance
some Precalculated values are used like 4784.39411916406 , 114.591559026165 ,
,4766.8999155991 which I am unable to understand. Can you please tell us how you have calculated those values

Thanks In Advance,
Maria
07/02/09 @ 20:35
Comment from: Kris Boldt [Visitor] · http://www.naturalsolutionsmag.com
*****
Thanks for the code. It worked perfectly.
I was looking for a solution for this type of function and was having a hard time until I found this post.

Great job George!

07/30/09 @ 15:51
Comment from: John [Visitor]
*****
This is great! Thanks so much for sharing.
08/01/09 @ 14:53
Comment from: Lou [Visitor]
*****
I cannot thank you enough for posting this...I'm working on an iPhone app to go along with an ASP-driven site that we have and I had no idea how to do this until I was directed to this article.
08/29/09 @ 09:36
Comment from: Michael [Visitor]
*****
I used your setup and things for the most part appear the way they should. However, I have several that don't seem to add up.

In comparing the results with Google Maps, some locations are off by a factor of 7.

For instance: http://www.google.com/maps?f=d&source=s%5Fd&saddr=10715+Downsville+Pike+Ste+100+MD+21740&daddr=444+East+College+Ave+Ste+120+State+College+PA,+16801&geocode=&hl=en&mra=ls&sll=40.799159,-77.856052&sspn=0.008836,0.01826&g=444+East+College+Ave+Ste+120+State+College+PA,+16801&ie=UTF8&t=h&z=8

Which uses these coordinates:
long/lat
40.799159,-77.856052
40.22018,-78.140205

Google says the distance (driving) is 144 miles vs the query saying ~21 miles (as the crow flies).

I've looked up other people's implementations and their calculations are practically the same. Such as: http://www.movable-type.co.uk/scripts/latlong.html

Any ideas?
10/21/09 @ 20:11
Comment from: George Mastros [Member] Email
Michael,

The calculate distance function I present here accepts 4 parameters. It's important that you send the data in to the function in the correct order. Using your coordinates, if you pass lat,long,lat,long you will get 21 miles. If you pass long,lat,long,lat (the correct order of the parameters) you get 42.7 miles.

Using Google Earth (it has a straight line ruler tool), the distance between your coordinates is reported as 42.61. The difference between Google Earth and my calculations is approximate 475 feet over the course of 42 miles (or 0.2% error).

By the way, your coordinates are wrong too. They should be:
40.798173,-77.856637
39.607699,-77.753558

My distance calculation formula reports this as 82.4 miles (crow fly). Google earth reports 82.2 miles. Following a street network, Google earth reports 144 miles.
10/22/09 @ 06:40
Comment from: Dhirendra Singh [Visitor]
Great job George!
Thanks for the code. It worked perfectly.
10/26/09 @ 04:15
Comment from: Emtucifor [Member] Email
For those who have been asking, the various constants used in George's distance calculation come from:
- Conversions to make different units of measure work: kilometers, miles, etc.
- The nonuniform, ovoid shape of the Earth requiring different circumferences for equatorial vs. meridional calculations.
- The particular Earth geometry system used. Some of these are NAD 27, NAD 83, WGS 84--see http://www.ngs.noaa.gov/faq.shtml .
12/14/09 @ 17:13
Comment from: Naomi [Member] Email
George,

Why you're using FLOAT for the @Temp?
04/13/10 @ 10:54
Comment from: shankar [Visitor]
Thanks for this script. Helped us.
06/21/10 @ 01:31
Comment from: Kevin [Visitor]
I am having an issue during the import process. It seems that some of the data cannot be converted to Decimal(8, 5) and the bulk import fails because of this in the Longitude column

Any thoughts?
07/08/10 @ 08:49
Comment from: George Mastros [Member] Email
Kevin, can you tell me what you used for the source of your data so that I can (hopefully) reproduce the problem and fix it?
07/08/10 @ 09:19
Comment from: Pollirrata [Visitor] Email
Great work!! It's really nice having something like this...

@Kevin
I faced some the same problem inserting Mexico's data 'cause the order of columns in txt file was different from script and our state abbreviations can be up to 4 characters... Just checking the file content worked for me "fixing" this issue... maybe this could help you....
07/08/10 @ 13:38
Comment from: RoyHink [Visitor]
Great work, and Thank You for posting this!

I am a web coder (PHP/JS/Python) and know just enough SQL to do Selects, Updates and Joins.

I apologize upfront if I'm missing something basic and commonly understood!

I am running MSSQL 2000 on a WIN2003 R2 server

This code is throwing the following errors in Query Analyzer:

Server: Msg 156, Level 15, State 1, Procedure CalculateDistance, Line 21
Incorrect syntax near the keyword 'CREATE'.
Server: Msg 111, Level 15, State 1, Procedure CalculateDistance, Line 21
'CREATE FUNCTION' must be the first statement in a query batch.
Server: Msg 137, Level 15, State 1, Procedure CalculateDistance, Line 24
Must declare the variable '@StartLatitude'.
Server: Msg 111, Level 15, State 1, Procedure CalculateDistance, Line 26
'CREATE FUNCTION' must be the first statement in a query batch.
Server: Msg 137, Level 15, State 1, Procedure CalculateDistance, Line 33
Must declare the variable '@StartLongitude'.

Would you be willing to give me some direction?

Thanks
07/30/10 @ 11:46
Comment from: SQLDenis [Member] Email
Roy, you need to either run each create function/create proc by itself or you need to put GO after the code block


CREATE FUNCTION [dbo].[LatitudePlusDistance](@StartLatitude FLOAT, @Distance FLOAT) RETURNS FLOAT
AS
BEGIN
RETURN (SELECT @StartLatitude + SQRT(@Distance * @Distance / 4766.8999155991))
END
GO -- -- add this

CREATE FUNCTION [dbo].[LongitudePlusDistance]
(@StartLongitude FLOAT,
@StartLatitude FLOAT,
@Distance FLOAT)
RETURNS FLOAT
AS
BEGIN
RETURN (SELECT @StartLongitude + SQRT(@Distance * @Distance / (4784.39411916406 * COS(2 * @StartLatitude / 114.591559026165) * COS(2 * @StartLatitude / 114.591559026165))))
END
07/30/10 @ 13:17
Comment from: George Mastros [Member] Email
Thanks for posting Denis. This is exactly what I would have suggested.
07/30/10 @ 14:10
Comment from: RoyHink [Visitor]
Thanks, SQLDenis, that worked perfectly.
07/30/10 @ 15:31
Comment from: Mike Payne [Visitor] Email
I was having a problem (Domain Error) with my own query and ran into this site trying to fix it.

I definitely like this method better.

However, it does not return consistent results as it sits.

I noticed that a comparison of the two queries (on a zip code not producing the domain error) returned slightly different result sets.

Even though one zip code was coming back within 4.7 miles in a 5 mile radius search - this query was not returning it.

I ended up multiplying the Distance by 1.1 and restricting the results without the addition of .1

This returned identical results. I think that this is a problem within the MAX/MIN Lat/Long functions.

This "quick fix" returned the right results for me - but I figured I would let you know, in case you wanted to research it further.

Thanks for the query either way.
08/03/10 @ 14:17
Comment from: ashwani01 [Member] Email
IT Professionals are responsible for creating new technology for development.
08/05/10 @ 01:25

Leave a comment


Your email address will not be revealed on this site.

Your URL will be displayed.
(Line breaks become <br />)
(Name, email & website)
(Allow users to contact you through a message form (your email will not be revealed.)