Commercial real estate history changes. What was once true can change over time. This is not due to politically motivated conspiracies, or even neglect, but to the nature of the raw data. Most statistics are derived from a set of properties, sampled on a regular basis, to produce a meaningful trend. Sometimes properties are added to the sample to keep up with changes in the structure of the market. Sometimes these changes are substantial, as when a large geographical area is added in a growing market. In that case, the history of those properties must be feathered into the statistics, requiring a revision of the prior aggregations. While this is cumbersome for users of the data to incorporate, it is necessary in ever-growing markets. The best sources of data will account for the changes clearly, and sometimes even publish parallel sets of statistics for a few editions to allow users to adapt to the revisions. You will still need to run your own checks to detect many of these changes; a few simple tests on your collected data will expose some of the most common failings.
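One simple test along these lines is to compare the overlapping back history of two editions of the same series: if a value you recorded from an earlier report no longer matches the current report, a silent revision has occurred. The sketch below assumes nothing about any particular vendor's format; the series values and period labels are invented for illustration.

```python
# Minimal revision check: compare the overlapping history of two editions
# of the same series. Sample data below is hypothetical.

def detect_revisions(old_edition, new_edition, tolerance=0.0):
    """Return (period, old_value, new_value) for every period where the
    new edition's back history differs from the old published value."""
    revised = []
    for period, old_value in old_edition.items():
        new_value = new_edition.get(period)
        if new_value is not None and abs(new_value - old_value) > tolerance:
            revised.append((period, old_value, new_value))
    return revised

# Hypothetical quarterly vacancy rates (percent) from two report editions.
q1_edition = {"2009Q1": 12.1, "2009Q2": 12.4, "2009Q3": 12.6}
q2_edition = {"2009Q1": 12.1, "2009Q2": 12.9, "2009Q3": 12.6, "2009Q4": 13.0}

print(detect_revisions(q1_edition, q2_edition))
# → [('2009Q2', 12.4, 12.9)]
```

Any non-empty result flags a back-history revision worth investigating before you splice the new edition into your database.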
Another common problem with time series data is the inclusion of an increasing number of markets in national statistics. As the brokerage firms grew, they added new markets to their statistics, which means the definition of the nation changed over time. As a result, reported levels of variables such as construction completions, inventory, net absorption, and vacant space cannot be compared across periods. It also means that ratios such as vacancy rates may be skewed as markets like Atlanta, with systemically high vacancy rates, are added. Worse are markets like Los Angeles, which changed dramatically over the decades and carry huge volumes of inventory. Unless you know which markets were added and when, it is impossible to repair the trend lines, and comparing the flawed time series to independent variables such as population, employment, or gross domestic product leads to incorrect conclusions.
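The distortion is easy to demonstrate with a toy example. In the sketch below, all markets, years, and figures are invented: a high-vacancy market joins the roster in the second year, and the "as published" national rate jumps even though the original market barely moved. Holding the market list fixed (a same-store comparison) reveals the true trend.

```python
# Hypothetical inventory and vacant space (millions of square feet)
# showing how an expanding market roster distorts a national vacancy rate.

def vacancy_rate(markets, data, year):
    """Aggregate vacancy rate (percent) over the given markets for a year."""
    vacant = sum(data[m][year]["vacant"] for m in markets if year in data[m])
    inventory = sum(data[m][year]["inventory"] for m in markets if year in data[m])
    return 100.0 * vacant / inventory

data = {
    "Boston":  {2000: {"inventory": 150, "vacant": 12},
                2001: {"inventory": 152, "vacant": 13}},
    "Atlanta": {2001: {"inventory": 200, "vacant": 36}},  # added in 2001
}

# "As published": whatever markets were in the roster each year.
published_2000 = vacancy_rate(["Boston"], data, 2000)
published_2001 = vacancy_rate(["Boston", "Atlanta"], data, 2001)

# "Same-store": hold the market list fixed to compare like with like.
same_store_2001 = vacancy_rate(["Boston"], data, 2001)

print(round(published_2000, 1), round(published_2001, 1), round(same_store_2001, 1))
# → 8.0 13.9 8.6
```

The published series leaps from 8.0 to 13.9 percent purely because Atlanta joined the roster; the same-store trend rises only to 8.6 percent.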
Finally, there are simple errors you can make yourself in compiling data from different sources. Pay special attention to the units of measure, and scrutinize the reports carefully, as they commonly contain mistakes in this area.
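When merging sources that mix square meters, square feet, and figures reported in thousands, it helps to normalize everything to one unit and flag magnitudes that look mis-scaled. The sketch below is one such check; the records and the plausibility threshold are assumptions for illustration, though the meter-to-foot conversion factor is standard.

```python
# Normalize inventory figures from mixed-unit sources to square feet and
# flag values that look implausibly small (likely a missed "thousands"
# scale factor). Records and threshold are hypothetical.

SQM_TO_SQFT = 10.7639  # standard conversion

def to_sqft(value, unit):
    if unit == "sqft":
        return value
    if unit == "sqm":
        return value * SQM_TO_SQFT
    if unit == "sqft_thousands":
        return value * 1_000
    raise ValueError(f"unknown unit: {unit}")

records = [
    {"market": "Berlin",  "inventory": 1_700_000, "unit": "sqm"},
    {"market": "Chicago", "inventory": 240_000,   "unit": "sqft_thousands"},
    {"market": "Austin",  "inventory": 9_500,     "unit": "sqft"},  # mislabeled?
]

normalized = {r["market"]: to_sqft(r["inventory"], r["unit"]) for r in records}

# A metro office inventory under a million square feet is suspicious
# (threshold is a guess; tune it to your market definitions).
suspect = [m for m, sf in normalized.items() if sf < 1_000_000]
print(suspect)
# → ['Austin']
```

A figure like Austin's 9,500 square feet almost certainly means the source reported in thousands without saying so; catching it at load time is far cheaper than explaining it in a forecast.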
Many decades ago the computer age promised paperless offices. Most people's desks would indicate that promise is still unfulfilled. Still, we do access and use a lot more information than we did fifty years ago, and most of us use a lot less paper. One of the most powerful computing tools at our disposal is the relational database. Tables, holding fields of data for millions of records, allow us to quickly filter, sort, and aggregate statistics. Computer models use these statistics to generate forecasts and scenarios, output in graphs, charts, reports, and yes, even paper. Compiling data into these databases requires careful entry and consistent collection methodologies for the statistics and models to be valid.
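The filter, sort, and aggregate workflow described above can be sketched with Python's built-in sqlite3 module. The table layout and figures here are hypothetical; the point is only how quickly a relational table turns raw property records into a market-level statistic.

```python
# Filter/sort/aggregate over property records with a relational database.
# Table, columns, and figures are invented for illustration.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE properties (
        market      TEXT,
        sqft        INTEGER,
        vacant_sqft INTEGER
    )
""")
conn.executemany(
    "INSERT INTO properties VALUES (?, ?, ?)",
    [
        ("Atlanta", 500_000, 90_000),
        ("Atlanta", 300_000, 30_000),
        ("Boston",  400_000, 20_000),
    ],
)

# Aggregate a vacancy rate per market, sorted from highest to lowest.
rows = conn.execute("""
    SELECT market,
           ROUND(100.0 * SUM(vacant_sqft) / SUM(sqft), 1) AS vacancy_pct
    FROM properties
    GROUP BY market
    ORDER BY vacancy_pct DESC
""").fetchall()
print(rows)
# → [('Atlanta', 15.0), ('Boston', 5.0)]
```

The same query scales from three rows to millions of records, which is precisely the leverage the relational model provides.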
Now the internet age is fully upon us. We can look up information on every conceivable subject from anywhere. Search engines like Google use sophisticated algorithms to deliver what we want, even if we are poor at describing it. But search engines are designed to provide us content to read and enjoy. How do we turn those results into data fit for a database? Services such as Fetch, Wolfram Alpha, and Seaglex do an excellent job there, but don't think your work is over at that point. You still need to spend time training these tools, and be very careful about how you use the results. You still need to understand what you are putting into your models. That said, the power of databases and the internet comes together in these tools, and you won't get a paper cut flipping through old publications.