Edvancer's Knowledge Hub

6 steps taken in data analysis

In this post, I’ll break the process of data analysis down into steps and look at each one separately, but first I’ll define it.

What is data analysis?

You have to know exactly what data analysis is before you can understand the process. Data analysis is the procedure of first setting goals as to what data you need and what questions you hope it will answer, then collecting the information, then inspecting and interpreting the data with the aim of sorting out the bits that are useful, in order to suggest conclusions and help various users with decision making. It focuses on knowledge discovery for predictive and descriptive purposes, sometimes discovering new trends and sometimes confirming or disproving existing ideas.

Actions taken in the Data Analysis Process

Business intelligence requirements may be different for every business, but the majority of the underlying steps are similar for most:

Step 1: Setting of goals

This is the first step in the data analysis procedure. It’s vital that understandable, simple, short, and measurable goals are defined before any data collection begins. These objectives might be set out in question format. For example, if your business is struggling to sell its products, some relevant questions may be, “Are we overpricing our goods?” and “How is the competition’s product different to ours?”

Asking these kinds of questions at the outset is vital because your collection of data will depend on the type of questions you have. So, to answer your question, “How is the competition’s product different to ours?” you will need to gather information from customers regarding what it is they prefer about the other company’s product and also launch an investigation into their product’s specs. To answer your question, “Are we overpricing our goods?” you will have to gather data regarding your production costs, as well as details about the price of similar goods on the market.
As you can appreciate, the type of data you’ll be collecting will differ hugely depending on what questions you need answered. Data analysis is a lengthy and sometimes costly procedure, so it’s essential that you don’t waste time and money by gathering data that isn’t relevant. It’s vital to ask the right questions, so the data analysis team knows what information you need.

Step 2: Setting priorities for measurement

Once your goals have been defined, your next step is to decide what it is you’re going to be measuring, and what methods you’ll use to measure it.

Determine what you’re going to be measuring

At this point, you’ll need to determine exactly what type of data you’ll need to answer your questions. Let’s say you want to answer the question, “How can we cut down on the number of people we employ without a reduction in the quality of our product?” The data you’ll need will be along these lines: the number of people the business is currently employing; how much the business pays these employees each month; other benefits the employees receive that are a cost to the company, such as meals or transport; the amount of time these employees are currently spending on actually making the product; whether or not there are any redundant posts that may have been taken over by technology or mechanization.

As soon as the data surrounding the main question has been obtained, you’ll need to ask other, secondary questions about the main one, such as, “Is every employee’s potential being used to the maximum?” and “Are there perhaps ways to increase productivity?” All the data that’s gathered to answer the main question and these secondary questions can be converted into useful information that will assist your company in its decision making. For instance, in the light of what is found, you may decide to cut a few posts and replace some workers with machines.
Choose a measurement

It’s vital that you choose the criteria that’ll be used in the measurement of the data you’re going to collect. The reason is that the way in which the data is collected will determine how it gets analyzed later. You need to ask how much time you want to take for the analysis. You also need to know the units of measurement you’ll be using. For example, if you market your company’s product overseas, will your money measurements be in dollars or yen? Regarding the employee question we discussed earlier, you would, for example, need to decide whether or not to take the employees’ bonuses or their safety equipment costs into account.

Step 3: Data Gathering

The next phase of the data analysis procedure is the actual gathering of data. Now that you know your priorities and what it is that you’re going to be measuring, it’ll be much simpler to collect the information in an organized way. There are a few things to bear in mind before gathering the data.

Check if there is already any data available regarding the questions you have asked. There’s no point in duplicating work if there is already a record of, say, the number of employees the company has. You will also need to find a way of combining all the information you have.

Perhaps you’ve decided to gather employee information by using a survey. Think very carefully about what questions you put onto the survey before sending it out. It’s preferable not to send out lots of different surveys to your employees, but to gather all the necessary details the first time around. Also, decide if you’re going to offer incentives for filling out the questionnaires to ensure you get the maximum amount of cooperation.

Data preparation involves gathering the data in, checking it for accuracy, and entering it into a computer to develop your database.
You’ll need to ensure that you set up a proper procedure for logging the data that’s going to be coming in and for keeping tabs on it before you can do the actual analysis. You might have data coming in from different places, such as from your survey, from employee interviews, or from observational studies, and perhaps from past records like payrolls.

Remember to screen the information for accuracy as soon as it comes in, before logging it. You may need to go back to some of the employees for clarification. For instance, some of the replies on the questionnaires may not be legible, or some may not be complete. If you’ve gathered data to analyze whether your product is overpriced, for instance, check that the dates have been included, as prices and spending habits tend to fluctuate seasonally.

Step 4: Data Scrubbing

Data scrubbing, or cleansing, is the process where you’ll find, then amend or remove, any incorrect or superfluous data. Some of the information that you’ve gathered may have been duplicated, it may be incomplete, or it may be redundant. Because computers cannot reason as humans can, the data input needs to be of a high quality. For instance, a human will pick up that a zip code on a customer survey is incorrect by one digit, but a computer will not.

It helps to know the main sources of so-called “dirty data.” Poor data capture, such as typos, is one. A lack of company-wide standards, missing data, different departments within the company each having their own separate databases, and old systems containing obsolete data are a few others.

There are data scrubbing software tools available, and if you’re dealing with large amounts of incoming information, they can save your database administrator a lot of time. For instance, because data has come in from many different sources like surveys and interviews, there is often no consistent format. As an example, there needs to be a common unit of measurement in place, such as feet or meters, dollars or yen.
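As a rough illustration of the scrubbing described above, here is a minimal Python sketch that drops duplicated and incomplete records and converts every measurement to a common unit. The rows, the field names (`id`, `height`, `unit`) and the `scrub` function are all invented for this example; real scrubbing tools do far more than this.

```python
# Hypothetical survey rows; field names and values are invented for illustration.
raw_rows = [
    {"id": 1, "height": "6", "unit": "ft"},
    {"id": 1, "height": "6", "unit": "ft"},    # exact duplicate of the row above
    {"id": 2, "height": "1.75", "unit": "m"},
    {"id": 3, "height": "", "unit": "m"},      # incomplete: no value recorded
]

FEET_TO_METERS = 0.3048  # one foot is exactly 0.3048 meters


def scrub(rows):
    """Drop duplicate and incomplete rows, then express every height in meters."""
    seen = set()
    cleaned = []
    for row in rows:
        key = (row["id"], row["height"], row["unit"])
        if key in seen:
            continue  # duplicated record
        seen.add(key)
        if not row["height"]:
            continue  # incomplete record
        value = float(row["height"])
        if row["unit"] == "ft":
            value *= FEET_TO_METERS
        cleaned.append({"id": row["id"], "height_m": round(value, 2)})
    return cleaned


cleaned = scrub(raw_rows)
print(cleaned)
```

In this sketch incomplete rows are simply dropped. In practice you might prefer to go back to the source for clarification, as suggested earlier, and you should keep a count of what was removed.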
The process involves identifying which data sources are not authoritative, measuring the quality of the data, checking for incompleteness or inconsistency, and cleaning up and formatting the data. The final stage in the process will be loading the cleaned information into the log, or “data warehouse” as it’s sometimes called.

It’s vital that this process is done, as “junk data” will affect your decision making in the end. For instance, if half of your employees didn’t respond to your survey, these figures need to be taken into account. Finally, remember that data scrubbing is no substitute for getting good quality data in the first place.

Step 5: Analysis of data

Now that you have collected the data you need, it is time to analyze it. There are several methods you can use for this, for instance, data mining, business intelligence, data visualization, or exploratory data analysis. The latter is a way in which sets of information are analyzed to determine their distinct characteristics. In this way, the data can finally be used to test your original hypothesis.

Descriptive statistics is another method of analyzing your information. The data is examined to find what the major features are, and an attempt is made to summarize the information that has been gathered. Under descriptive statistics, analysts will use some basic tools to help them make sense of what sometimes amounts to mountains of information. The mean, or average, of a set of numbers can be used. This helps to determine the overall trend and is easy and quick to calculate. It won’t provide you with much accuracy when gauging the overall picture, though, so other tools are also used. Sample size determination is one example. When you measure information that has been gathered from a large workforce, for example, you may not need to use the information from every single member to get an accurate idea.
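The two ideas above, the limits of the mean and working from a sample, can be sketched with Python’s standard `statistics` and `random` modules. All the figures below are invented for illustration, and the sample size of 100 is an arbitrary choice, not a recommendation.

```python
import random
import statistics

# Hypothetical monthly salaries; one very high value pulls the mean upward.
salaries = [2000, 2100, 2200, 2300, 20000]
mean_salary = statistics.mean(salaries)      # 5720
median_salary = statistics.median(salaries)  # 2200

# Hypothetical workforce of 1,000 salaries, generated only for this sketch.
rng = random.Random(42)  # fixed seed so repeated runs draw the same sample
workforce = [rng.gauss(3000, 400) for _ in range(1000)]

# Estimate the average from a random sample of 100 instead of all 1,000.
sample = rng.sample(workforce, 100)
workforce_mean = statistics.mean(workforce)
sample_mean = statistics.mean(sample)

print(f"mean {mean_salary}, median {median_salary}")
print(f"all 1,000: {workforce_mean:.0f}, sample of 100: {sample_mean:.0f}")
```

The first part shows why the mean alone can mislead: a single unusually high salary moves it far from what most employees earn, while the median stays put. The second part shows that a random sample can give an estimate close to the figure for the whole workforce.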
Data visualization is when the information is presented in visual form, such as graphs, charts, tables, or pictures. The main reason for this is to communicate the information in an easily understandable manner. Even very complicated data can be simplified and understood by most people when represented visually. It also becomes easier to compare the data when it’s in this format. For example, if you need to see how your product is performing compared to your competitor’s product, all the information, such as price, specs, and how many were sold in the last year, can be put into graph or picture form so that the data can be easily assessed and decisions made. If your prices are higher overall than those of the competition, you will see it quickly, and this will help you identify the source of the problem.

Basically, any method can be used, as long as it helps the analyst to examine the information that has been collected, with the goals of making sense of it, looking for patterns and relationships, and helping to answer your original questions.

The data analysis part of the overall process is very labor intensive. Statistics need to be compared and contrasted, looking for similarities and differences. Different researchers prefer different methods. Some prefer to use software as the main way of analyzing the data, while others use software merely as a tool to organize and manage the information.

Step 6: Result interpretation

Once the data has been sorted and analyzed, it can be interpreted. You will now be able to see if what has been collected is helpful in answering your original question. Does it help you with any objections that may have been raised initially? Are any of the results limiting, or inconclusive? If this is the case, you may have to conduct further research. Have any new questions been revealed that weren’t obvious before?
If all your questions are dealt with by the data currently available, then your research can be considered complete and the data final. It may now be used for the purpose for which it was gathered: to help you make good decisions.

Interpret the data precisely

It is of paramount importance that the data you have gathered is carefully interpreted. It’s vital that your company has access to experts who can give you the correct results. For instance, perhaps your business needs to interpret data from social media such as Twitter and Instagram. An untrained person will not be able to correctly analyze the significance of all the communication regarding your product that happens on these sites. It is for this reason that most businesses nowadays have a social media manager to deal with such information. These managers know how the social platforms function and the demographic that uses them, and they know how to portray your company in a good light on them as well as extract data from the users.

For every company to be successful, it needs people who can analyze incoming data correctly. The amount of information available today is bigger than it has ever been, so companies need to employ professionals to help stay on top of it all. This is particularly true if the founders of a company don’t have much knowledge of data. It would then be a great idea to bring an analyst into the team early.

There is so much strategic information to be found in the data that a company accumulates. An analyst can help you decide what parts of the information to focus on, show you where you are losing customers, or suggest how to improve your product. They will be able to suggest to management which parts of the data need to be looked at for decisions to be made.

For instance, a trained data analyst will be able to see that a customer initially “liked” your product on Facebook, then Googled your product and found out more about it, then ordered it online and gave a positive review on your website. The analyst can trace this pattern and see how many other customers do the same. This information can then perhaps help your business with advertising, or with expansion into other markets. The analyst can also collect data regarding whether putting graphics with “tweets” increases interest, and can tell what age group it appeals to more. They’ll be able to tell you what marketing techniques work best on the different platforms.

It is hoped that from this you can see how vital data collection and analysis are for the well-being of your company, and how they can help in all departments of your business, from customer care, to employee relations, to product manufacture and marketing.

Manu Jeevan

Manu Jeevan is a self-taught data scientist and loves to explain data science concepts in simple terms. You can connect with him on LinkedIn, or email him at manu@bigdataexaminer.com.