Hi,
What's the most efficient way to store the following information:
* Table contains 1 million listings
* Each listing can be geo-targeted to any of the 200+ countries
* Searches return listings based on geo-location
Storage options:
Option #1 (normalized)
* ListingsTable (PK listingID int) [1 million rows]
* ListingGeoLocations (listingID, geoLocationID) [could be up to 200 million rows]
Option #2 (denormalized)
* ListingsTable (PK listingID int, binary(32) with bit-mask consisting of 200 bits one for each location)
Did anyone have experience with similar structures? Which option is more efficient?
Thanks,
Avoption 2 sounds inferior to me
let's say you were looking for listings corresponding to the 37th bit in the bitmask, how would you find them?
by inspecting the bitmask of all 1 million rows, that's how|||I'm too tired to answer
Look up normalization on google|||Option #1.
As Rudy pointed out, you are going to have problems doing bitwise calculations on bitmasks over 30 bits, after which you exceed the capacity of the BigInt datatype.
Also, while under scenario #1 your table COULD reach 200 million rows, in reality what is the average number of countries you expect for each geolocation? Probably significantly less than 200.|||I understand that option #1 is normalized and will allow the search for listings in a specific country using query. However, say when a search is performed, it is based on an set of keywords (which are also tied to listings).
So say you looking for "dentists". You can about 100 listings that are related. Now, the APPLICATION (not SQL) will simply "filter" the 100 listings by doing geo-specific bitwise comparissons and will only display the listings that have the proper country bit set.
Still Option #1? The search needs to be very fast.|||You are NOT going to do bitwise operations on a 200 value bitmask.|||So say you looking for "dentists". You can about 100 listings that are related. you can? you can what? how does that work?|||you can? you can what? how does that work?
Ask Harry potter|||Why not have a
Listings table (Pk ListingID)
Locations Table (PK LocationID)
and,
ListingLocations table (PK ListingID, LocationID)
ListingLocations is an intersection table, used to define a many-to-many relationship. The two required fields, ListingID and LocationID are also foreign keys to the appropriate table.
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment