Learn R programming from Edvancer.

Python

Python is a general-purpose, intuitive, and easy-to-learn programming language. It has many well-known data analysis libraries, such as NumPy, pandas, SciPy, and scikit-learn, which make it popular in the data science community. I would recommend using the IPython Notebook as your programming environment for data analysis in Python. You can install either Anaconda or Enthought Canopy; both distributions come with these libraries pre-installed. Major banks such as Bank of America and JP Morgan use Python to build new products and crunch financial data.
Learn how to use Python for data science from Edvancer.

Julia

Normally, data geeks use one programming language (like R or Python) to prototype a predictive model and another (like C or C++) to make the model faster. You need to learn two or three programming languages, write a significant amount of code, and switch between different code editors and source files to deploy a working predictive model. This is cumbersome and takes more time than any data scientist can afford to waste. In Julia, you can write code with performance close to that of C, so you don’t have to rewrite it in a low-level language (like C or C++). Julia’s main drawback at this point is a dearth of libraries, but Julia makes it easy to interface with existing C libraries. I encourage you to download Julia and try it; it has an active and supportive community.

Hadoop

The Hadoop platform was designed to solve problems where you have many complex data sets that don’t fit into a traditional relational database. At its core is a distributed file system (HDFS), spread across multiple nodes/servers, that helps businesses store vast volumes of unstructured data, at speed and on commodity hardware, at a very low cost. Hadoop uses a programming model called MapReduce to access and analyze the data stored in HDFS. A MapReduce job processes the data in parallel on the nodes where it is stored: a map step turns each record into key-value pairs, and a reduce step combines the values for each key.
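To make the MapReduce model mentioned above more concrete, here is a minimal sketch of the idea in plain Python, using the classic word-count example. It runs on a single machine and does not use Hadoop's actual API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample records are assumptions made purely for illustration.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group values by key, roughly what a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values emitted for one key."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle, and reduce in sequence on an in-memory list of records."""
    pairs = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical input: each string stands in for a block of data held on one node.
records = ["big data big clusters", "data on many nodes"]
counts = word_count(records)
```

On a real cluster, the map calls would run in parallel on the nodes that hold each block of data, and the grouped keys would be distributed across several reducers; this sketch only shows the shape of the computation.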