Experienced statisticians — the least sexy of titles given to people who explore data — are quick to inform the eager apprentice that most of their time will be spent finding, cleaning, and preparing data. The analysis part — that is, the part that feels the most like panning for gold — is a very small fraction of the job.
An insidious assumption exists, promoted by software vendors, that knowing how to use a particular data analysis software product “auto-magically” imbues one with the skills of a data analyst. Even with good software—something that’s rare—this is far from true. Just as with any area of expertise, data analysis requires training and practice, practice, practice.
… it’s not about having the data, but about the ideas and computational follow-through needed to make use of it …
When’s the last time a city did something so exciting that people from every walk of life and every part of town were talking about it? That’s the reaction Google Fiber sparked in Kansas City, and now the excitement — and electrical current of fiber-to-the-home connections — will reach Austin, Texas.
It’s a good question.
Affordable housing. That’s not a very sexy answer, but it’s true. If we had universal healthcare and affordable housing, people could be more creative, sleep better at night, and live longer.
If you have any Denton tips or food recommendations, hit me up on Twitter: @austinkleon
Any data scientist worth their salary will tell you that you should start with a question, NOT the data. Unfortunately, data hackathons often lack clear problem definitions. Most companies think that if you can just get hackers, pizza, and data together in a room, magic will happen. This is the same as if Habitat for Humanity gathered its volunteers around a pile of wood and said, “Have at it!” By the end of the day you’d be left with half a sunroom with 14 outlets in it.
Transparency can be a powerful thing, but not in isolation. So, let’s stop passing the buck by saying our job is just to get the data out there and it’s other people’s job to figure out how to use it. Let’s decide that our job is to fight for good in the world.
“An open and transparent administration makes it easier for residents to hold their government accountable, but it also serves as a platform for innovative tools that improve the lives of all residents,” said Mayor Emanuel, in a statement on the city website.
“Chicago’s vibrant technology and startup community will leverage this wealth of open, public data to create applications that will improve service delivery and lead to greater quality of service for residents and more public engagement in City government.”
The city released 21 new “high value” datasets today, including real-time traffic data from Chicago Transit Authority (CTA) buses, environmental data, and data on liquor regulation and recycling programs.
When asked what made these datasets high value, the Mayor’s Office responded via email.
“The datasets released today aren’t necessarily more critical than the more than 400 others that have been released,” wrote Caroline Weisser, a spokesperson for the Mayor’s Office.
“They continue the commitment the administration has taken to being a leader in municipal open data. The executive order itself codifies the actions that Brett and John Tolva, the CTO, have taken over the past year and a half to pursue both open data policy and detailed analytics in tandem. Making a firm commitment to continue adding writable data to the data portal about how the city works provides the raw materials for the City to collaborate and innovate with the developer community, which ultimately helps the City do a better job of serving Chicagoans.”
For more context on opening government, the Chicago way, read our feature from 2011 and more recent coverage of how Brett Goldstein, Chicago’s chief information officer and chief data officer, is using data in the public sector.