Excel

Why Do We Use Excel?

Efficiency.

Excel proves itself to be an efficient and easy to use platform in that we can easily transfer the data created in SQL to Excel. Data can be imported as what is known as a CSV (comma separated values) and retain the exact design as what was seen in SQL in terms of column headers and their respective information.

Visuals.

Not only can we easily transfer data platform to platform but Excel is extremely customizable with color schemes and font styles. SQL and other database managements systems show us the data stripped down to the numbers and characters. Excel has the power to create integrated graphs, tables, dashboards, and colorful table schemes which we will see. In my eyes (excuse the pun), visual effects on data can make a presentation more exciting and memorable with the related data.

Analysis.

From the company data consisting of Employees, Branches, Supplier and Product information, and Sales the information that stands out to an Analyst is the Sales data. The sales data appeals to us because that is what is keeping this JM Sports company afloat. Every company needs a stream of income to stay in business so it would make sense to analyze the information that corresponds to that concept. We are analyzing the Sales but what about it? Since there are 3 branches, 41 employees, 5 products, and 3 different months in the dataset this makes for many combinations to analyze. We can look at which branch outperformed others, or which branch for which product during what month outperformed the others. With the amount of subcategories created, there are many opportunities and combinations of factors to analyze to reach different conclusions that may be extremely valuable to tell us where or where not to allocate corporate interest down the road.


Formula Creation

2019-07-22%2B%25283%2529.jpg

Here is our Sales data which was created in SQL, however there are quite a few columns that have been added to this table. As we know, the Sales table from SQL consists of the employee ID, product ID, quantity of the product sold, and the month sold but this has been modified to have the Product Price, Revenue and Profit Margin for each sale. How have these columns been integrated into our sales table? These new columns are a product of specific formula creation, and ‘if’ statements because of underlying conditions that are within the data. Taking one new column at a time, the Product Price references our Product table for the price of each different item (shoulder pads, helmet, cleats, etc). The Revenue is simply calculated by multiplying the Product Price by the quantity sold by the respective salesperson for that row. Now for the more involved and advanced column is the Profit Margin. The Profit Margin is calculated as Revenue minus the price at which the Supplier sells it to JM Sports for. In essence, what we need is the Revenue minus the Quantity multiplied by the Supplier price and then we will obtain the Profit Margin. Notice how this formula references has an interchanging Supplier price because for every entry the Product ID will be different, hence a different Supplier price, so how do we account for this? This concept calls for a formula integrated with what is known as IF statements. Simply, if the Product ID is equal to 1 then the Supplier price is equal to 114.5, if the Product ID is equal to 2 then the Supplier price is equal to 96.25 and so on. Not only is this an if statement, but a ‘Nested if’ statement. We saw the basic setup of the ‘if’ statement but what if the Product ID is not equal to the number listed? This is where we Nest the If statement. You can think of it inversely and say “what is the Product is not 1?” well then maybe the Product ID is 2,3,4, or 5. This logic makes for one large nested if statement that goes through the iterations of each Product ID number until the specific row has a matching Product ID, then Excel pulls the respective Supplier selling price from the Supplier table and applies it to our Profit Margins formula.

Analyze!

2019-07-22+%284%29.jpg

Now that our Sales table is in order and has the variables needed to conduct a reasonable analysis, we can do just that. Excel has Vlookup functions and customizable Pivot tables. The Vlookup function is great for any dataset to retrieve desired information at an instant. So what is this function and what does it do? The Vlookup (short for Vertical lookup) lets us search through a specific column for the matching input and it returns any information you design it to return. In the case of this Sales dataset, I designed the Vlookup to search for a given Employee ID and to return how much revenue they generate for the company, along with their first and last name. I can easily change the employee ID and instantly retrieve the information listed at an instant, instead of manually scrolling through the dataset.

Seen on this worksheet is also a custom Pivot table. Seen here are 3 Pivot tables based on Branch ID, Product ID revenue, and Product ID quantity. The Pivot tables make for easy and quick analysis based on these subcategories. As shown, where the Product ID is equal to 1 there has been a generated revenue of $23,740 and so on. Where Product ID is equal to 3, there has been 322 units sold within the 3 months time. Lastly, the Branches and their overall contributed revenue. Pivot tables are flexible tools because we can sort and apply filters. If we would like to view our tables in ascending order with respect to the revenue or units sold, this is entirely achievable, or if I would like to view the performance for only 2 specific Products instead of all 5 at once. Pivot tables are a quick and easy way to analyze performance over time based on differing variables.

Visualize!

2019-07-22+%288%29.jpg

Excel allows for whats known as a dashboard. Here we see 4 different Pivot Charts along with slicers. These Pivot charts put our Pivot table work into visualizations and the slicers allow us to pick and choose which set of information we would like to see. Just from a broad visualization of sales data we can conclude which branch, product, or month is underperforming, overperforming, or right on track. Digging deeper into the data, we then use the Slicers to go into subcategories to make specific reports.

2019-07-22 (9).png

Observe, if I wanted to obtain Sales performance for the Philadelphia branch, selling only helmets, legs pads, and cleats during the month of August. The slicer options referring to numerous columns filter through the data to match the criteria chosen and the results are instantly displayed on the Pivot Charts beside the slicers.


Results

What conclusions can we make about the visualized data seen from the Pivot Charts? From a broad stroke we can see that the Boston Branch (Branch ID=3) is clearly underperforming, the worst selling product is the shoulder pads (Product ID=2), the most generated revenue product is the also the league admission (Product ID=4) and the month of July is underperforming. There can be a number of reasons why the data has panned out this way. Boston may have the lowest revenue of the three branches because compared to New York and Philadelphia the population is relatively small. Although, Boston is known to have a huge football fan base so maybe the Boston branch is underperforming. The worst selling product being the shoulder pads could be because of an outside competitor that has a much cheaper sale price, or maybe shoulder pads are a one time purchase for football athletes because of their durability. The league admission generates the most revenue due to its high price compared to other products as well as the league admission itself is an annual purchase. For football athletes, the league admission must be paid for every season they would like to participate in so this could explain why the revenue is much higher than other products. July underperforming could be due to football season starting in September so the participants are not thinking about football equipment just yet. July is typically still off season training so JM Sports should adopt training gear into their inventory to increase July revenue.

The speculations for each variable and subcategory within the data can be endless. I personally enjoy trying to find solutions for underperforming areas of a company based on market research and proposing new ideas to improve these areas.

Next we are going to look at a different dataset using R. I will show the practices of data preparation, and basic analysis within R’s complex software.