Today, most enterprise software development requires integrating with systems built by other teams and other companies. Those integrations often need to coordinate data and processes, which means the systems share a set of identifiers to make the integration work.
When you are building an integration with another system that returns an identifier like 10488084, do you store it as an integer, a long, or a string?
With a few exceptions that most of us developers don’t really need to deal with, I would universally say it should be a string. Why should it be a string when it’s clearly a number?
- We aren’t performing any mathematical operations on it.
- We don’t have any control of the identifier, or the range of values.
- We likely don’t need to sort by ID, so whether the sort is numeric or alphabetic isn’t a concern.
- Storage space is cheap. We don’t need to worry about the efficiency of storing the value as a string vs a numeric type.
- Performance on ID comparison is easily solved with data structures / indexes. Your database system is likely very capable at matching these strings quickly, even if it’s slightly slower than using numeric types.
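To make the lookup and sorting points concrete, here is a minimal sketch in Python. The IDs and the `orders_by_external_id` dictionary are made up for illustration; a real system would more likely rely on a database index than an in-memory dictionary.

```python
# Hypothetical third-party IDs, kept exactly as the other system returned them.
orders_by_external_id = {
    "10488084": "order A",
    "9": "order B",
    "3000000000": "order C",
}

# Lookup is a hash match on the string; no arithmetic is involved.
found = orders_by_external_id["10488084"]

# Sorting the string keys is alphabetic, which differs from numeric order.
alphabetic = sorted(orders_by_external_id)
# If numeric order is ever needed, it can be applied at sort time.
numeric = sorted(orders_by_external_id, key=int)

print(found)
print(alphabetic)
print(numeric)
```

The alphabetic and numeric orders differ, but that only matters if you actually sort by ID.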
Based on those reasons, any ID that is retrieved from a third-party system should be stored as a string, which protects your system from changes in the third-party system. For example, if you store 10488084 as an integer in .NET (a 32-bit int with a maximum value of 2,147,483,647), and then the third party decides all new IDs will start at 3000000000, you have a problem. The new IDs will overflow and throw exceptions unless you convert them to longs or strings throughout your application. So you might as well start with strings.
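Here is a rough sketch of that failure, written in Python rather than .NET. Python integers do not overflow on their own, so this example assumes a 32-bit signed storage slot and simulates it with `struct.pack`. The function names `store_as_int32` and `store_as_string` are hypothetical.

```python
import struct


def store_as_int32(external_id: str) -> bytes:
    """Pack the ID into 4 bytes, the way a 32-bit signed integer field would hold it."""
    return struct.pack("<i", int(external_id))


def store_as_string(external_id: str) -> str:
    """Keep the ID exactly as the third party sent it."""
    return external_id


old_id = "10488084"
new_id = "3000000000"  # larger than the 32-bit signed maximum of 2147483647

# The old ID fits in 32 bits without trouble.
packed_old = store_as_int32(old_id)

# The new ID does not fit, so the 32-bit path raises an error.
try:
    store_as_int32(new_id)
    int32_accepts_new_id = True
except struct.error:
    int32_accepts_new_id = False

print(int32_accepts_new_id)
print(store_as_string(new_id))
```

The string path does not care how large the third party's numbers get, or whether they stay numbers at all.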
What are your thoughts? Any other reasons why we shouldn’t universally store third-party IDs as strings?