BIG DATA TERMS EVERYONE SHOULD KNOW

75 BIG DATA TERMS EVERYONE SHOULD KNOW
Published by RAMESH DONTHA · JULY 21, 2017 on Dataconomy.com

This article is a continuation of my first article, 25 Big Data terms everyone should know. Since it got such an overwhelmingly positive response, I decided to add an extra 50 terms to the list.

Just to give you a quick recap, I covered the following terms in my first article: Algorithm, Analytics, Descriptive analytics, Prescriptive analytics, Predictive analytics, Batch processing, Cassandra, Cloud computing, Cluster computing, Dark Data, Data Lake, Data mining, Data Scientist, Distributed file system, ETL, Hadoop, In-memory computing, IoT, Machine learning, MapReduce, NoSQL, R, Spark, Stream processing, Structured vs. Unstructured Data.

Now let’s get on with 50 more big data terms. Apache Software Foundation Read More …

DATA SCIENCE SKILLS, AND HOW TO LEARN THEM

TOP 10 DATA SCIENCE SKILLS, AND HOW TO LEARN THEM
Published by EILEEN MCNULTY · DECEMBER 25, 2014 on Dataconomy.com

One of our most popular posts this year came from Ferris Jumah, a data scientist at LinkedIn, who mapped the most popular skills of data scientists by scraping LinkedIn profile data. A common comment from data scientists who came across this post (as with most of our posts focused on data science skillsets) was “Surely, you can’t expect data scientists to have all these skills?” Naturally, we don’t: every data science role involves a particular combination of some of the skills, and anyone who had mastered all of the programming languages listed alone would be some sort of computing demigod. Read More …

BIG DATA 101: INTRO TO PROBABILISTIC DATA STRUCTURES

CHRISTOPHER LOW · APRIL 17, 2017 on Dataconomy.com

Oftentimes while analyzing big data we need to check properties of the data, such as the number of items in the dataset, the number of unique items, and how frequently each item occurs. Hash tables or hash sets are usually employed for this purpose. But when the dataset becomes so large that it cannot fit in memory all at once, we need special kinds of data structures known as probabilistic data structures. Streaming applications usually require data to be processed in one pass and then updated incrementally. Fortunately, probabilistic data structures fit that processing model very well. Such data structures ignore collisions but errors are controlled Read More …
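
To make the idea concrete, here is a minimal sketch of one well-known probabilistic data structure, a Bloom filter, which answers "have I seen this item?" in fixed memory. The excerpt above does not name a specific structure, so the choice of a Bloom filter, the class name `BloomFilter`, and the parameters `num_bits` and `num_hashes` are all assumptions made for illustration. Collisions between items are simply tolerated: the filter never misses an item that was added, but it may occasionally report an item that was not.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter; sizes are arbitrary, not tuned."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch simple (a real one would pack bits).
        self.bits = bytearray(num_bits)

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        # One pass per item, incremental update: suits streaming input.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means "definitely not seen"; True means "probably seen".
        return all(self.bits[pos] for pos in self._positions(item))


seen = BloomFilter()
for word in ["spark", "hadoop", "cassandra"]:
    seen.add(word)

print(seen.might_contain("spark"))
```

Memory use is fixed by `num_bits` regardless of how many items stream through; the trade-off is that the false-positive rate rises as more items are added.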

R data wrangling with DPLYR: Tutorial

R data wrangling with DPLYR: Tutorial with 50 samples
Published on February 8, 2017 on LinkedIn.com by Michiel Victor

Coming from the world of SQL and busy learning R? This is the bridging article.

Written by Deepanshu Bhalla

It’s a complete tutorial on data wrangling, or data manipulation, with R. This tutorial covers one of the most powerful R packages for data wrangling, dplyr. The package was written by the popular R programmer Hadley Wickham, who has written many other useful R packages such as ggplot2 and tidyr. It’s one of the most popular R packages to date. This post includes several examples and tips on how to use the dplyr package for cleaning and transforming data. What is dplyr? Read More …

Data Engineering & Data Science

Infographic: Data Engineering & Data Science
Published on February 21, 2017 on LinkedIn.com by Michiel Victor, Jake Moody

If you’re interested in the field of analytics, you’ve probably heard the terms Data Engineering and Data Science, but do you know the difference? Although there has historically been considerable overlap between the two professions, they are becoming more distinct. DataCamp created an infographic to help you understand the skills and responsibilities of each role. You’ll also get a chance to compare salaries, popular software and tools used by each, and some educational resources to help get you started!

Author: Michiel Victor
Data Architect | Data Analyst | Data Warehousing | Data Security | Database Performance | SQL | Reporting | Read More …
