One question that naturally arises when dealing with large amounts of data (that is, many of these objects) is: which data type is best suited to managing ids?
The choice matters, because the data type can make a big difference to the performance of the pipeline that processes the data.
If we look at what ids mean, the most correct way to store them is as Strings.
Ids do not represent numbers on which arithmetic or ordering is meaningful; they are labels, usually standing for real-world objects.
Strings, however, are generally more expensive than Integers, both to store in memory and to process (for example, when comparing or hashing them). Using Integers is therefore a natural way to improve performance.
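
To make the memory cost concrete, here is a minimal sketch in Python (the language is an assumption, since the text does not name one) that compares the per-object size of the same ids stored as integers and as strings. The id values and the `total_size` helper are hypothetical and used only for illustration. The exact figures depend on the interpreter and platform, so treat the result as indicative rather than as a benchmark.

```python
import sys

# Hypothetical ids, chosen only for illustration.
numeric_ids = list(range(10_000_000, 10_001_000))
string_ids = [str(i) for i in numeric_ids]


def total_size(values):
    """Sum the per-object sizes reported by sys.getsizeof.

    The overhead of the list that holds the values is not included.
    """
    return sum(sys.getsizeof(v) for v in values)


int_bytes = total_size(numeric_ids)
str_bytes = total_size(string_ids)

print(f"ids as integers: {int_bytes} bytes")
print(f"ids as strings:  {str_bytes} bytes")
print(f"ratio (strings / integers): {str_bytes / int_bytes:.2f}")
```

In CPython the string representation should come out larger, because every string object carries its own header in addition to its characters. Typed columns in array or dataframe libraries usually widen the gap further, although this sketch does not measure that.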