In the last 20 years, IT has been reshaped by numerous trends related to the improvement of the internet. Cloud computing, smartphones and the Internet of Things have all been enabled by ubiquitous IP network connectivity. Just between 2011 and 2016, the average home internet speed more than tripled, from 10 Mbps to 31 Mbps, according to the Federal Communications Commission.
Such rapidly increasing speeds - which should continue to climb as innovations such as gigabit fiber enter the mainstream - have fueled the popularity of everything from on-demand music streaming to vast cloud-managed databases. The latter systems are now essential components of the field known as data science (alternatively "big data" or "data analytics"), which only emerged in the late 1990s.
Data science: A new frontier in internet-centric innovation
What is data science? We can loosely define it as a cross between mathematics (especially statistics, but also algebra and calculus to lesser degrees) and computer programming. A practitioner, known as a data scientist, will typically manipulate vast datasets using a combination of statistical techniques and technical platforms, which might run the gamut from basic spreadsheets to cutting-edge cloud frameworks. The ultimate goal is to draw actionable insights from all the information at your fingertips.
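To make "statistical techniques" concrete, here is a minimal sketch using only Python's standard library. The order counts are invented for illustration, and the two-standard-deviation cutoff is just one common rule of thumb, not a method prescribed by any particular tool.

```python
import statistics

# Hypothetical daily order counts for one week (made-up numbers)
daily_orders = [132, 128, 141, 150, 310, 137, 129]

mean = statistics.mean(daily_orders)      # arithmetic average
median = statistics.median(daily_orders)  # middle value, robust to outliers
stdev = statistics.stdev(daily_orders)    # sample standard deviation

# Flag days that sit more than two standard deviations from the mean
outliers = [x for x in daily_orders if abs(x - mean) > 2 * stdev]

print("mean:", mean, "median:", median)
print("unusual days:", outliers)
```

The gap between the mean and the median hints that one day is skewing the average, and the outlier check identifies which one. Spotting that kind of anomaly is a small-scale version of the "actionable insight" a data scientist looks for.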
If you want to become a data scientist, what should you do? Data science is undoubtedly one of the most alluring career paths at the moment for anyone interested in IT. In 2009, Google chief economist Hal Varian labeled it the hottest career path of the upcoming decade, while the U.S. Bureau of Labor Statistics projected 11 percent growth (faster than average) in employment for computer and information research scientists between 2014 and 2024.
Any prospective data scientist will need to be fluent in several specific applications and frameworks, many of which you are likely to encounter during the course of obtaining common IT certifications from vendors such as Microsoft and trade associations like CompTIA. Let's take a look at some of the key tools to know in this field.
Excel
Excel fits the classic mold of "easy to learn, hard to master." Its ubiquity in enterprise software (as part of Office 365, the business world's favorite cloud computing platform) makes it an obvious point of entry for data manipulation tasks, as does its ease of use for non-programmers. Advanced formulas and chart creation features further cement its utility within the sphere of data science. An Excel certification can help you learn about pivot tables, Excel programming and other functions that unlock the spreadsheet application's power.
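For readers who have not used a pivot table, the sketch below shows roughly what one computes: rows of raw records are grouped by one field, spread across columns by another, and summed. It is written in plain Python rather than Excel, and the sales records and the `pivot_sum` helper are hypothetical names invented for this illustration.

```python
from collections import defaultdict

# Hypothetical sales records: (region, quarter, revenue)
sales = [
    ("East", "Q1", 120),
    ("East", "Q2", 150),
    ("West", "Q1", 90),
    ("West", "Q2", 110),
    ("East", "Q1", 30),
]

def pivot_sum(rows):
    """Sum revenue with regions as rows and quarters as columns."""
    table = defaultdict(lambda: defaultdict(int))
    for region, quarter, revenue in rows:
        table[region][quarter] += revenue
    # Convert nested defaultdicts to plain dicts for easier printing
    return {region: dict(cols) for region, cols in table.items()}

pivot = pivot_sum(sales)
for region in sorted(pivot):
    print(region, pivot[region])
```

In Excel the same summary comes from dragging fields into the Rows, Columns and Values areas of a pivot table, with no code required, which is part of why the application works well as an entry point.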
Infrastructure-as-a-Service platforms (Azure and AWS)
IaaS is integral to data science because it provides IT resources on a vast scale and on demand. The virtually limitless computing power, storage capacity and networking capabilities of an IaaS cloud greatly simplify the process of extracting insights. Unsurprisingly, respondents to an O'Reilly survey pointed to specific IaaS components such as Amazon Redshift as key tools in their work.
Although there are many nominal players in the IaaS market, a few vendors account for the vast majority of all IaaS spending: Amazon Web Services leads the pack, while Microsoft Azure has long been the runner-up, well ahead of would-be competitors such as Google and IBM. The specific features of each platform differ, but the fundamental technical skills needed for managing them are related. Accordingly, an AWS expert might pursue specifically designed coursework in Azure to acquire the additional expertise for working with his or her company's cloud implementation.
Linux and Windows
The Linux kernel is at the heart of numerous open source operating systems and tools, due to its easy availability, robust design and high flexibility. In the O'Reilly survey, Linux was the second most common OS in use by respondents, trailing only Windows. Almost half (49 percent) of them reported using Linux either alone or in tandem with other OSes, including Windows and macOS.
Linux has a large ecosystem of compatible tools, and it is commonly used in data science alongside Apache Spark and Hadoop as well as Python, all of which are open source. In contrast, Windows has its own different constellation of supporting big data apps and frameworks, which usually includes the aforementioned Excel in addition to Microsoft SQL Server and Windows Server.
Which OS a company uses will depend heavily on its size, budget and key applications. Many organizations choose to use both Linux and Windows to varying degrees, due to the unique benefits of each ecosystem. Certifications such as CompTIA Linux+, which helps with the installation and maintenance of Linux-based systems, and Office 365 can help you stay ahead of the curve for both OSes.
Get on the track to a data science career by earning IT certifications
There is plenty to learn in becoming an expert data scientist, given the need for skills that span technical programming, project management and mathematical analysis. Obtaining IT certifications from a trusted institution such as New Horizons Computer Learning Centers is a practical step on the road to data science success.
First, look for a location near you to plan your visit. Then, be sure to take a glance at our full course listings and our supporting resources like our webinars page, where you can learn more about how to look for IT careers that interest you. Data science is positioned to be one of the most lucrative career tracks for the rest of the decade, so it will literally pay to get the training you need to stand out from the pack of job applicants.