class: center, middle, inverse, title-slide

.title[
# BANL 6100: Business Analytics
]
.subtitle[
## Introduction
]
.author[
### Mehmet Balcilar
mbalcilar@newhaven.edu
]
.institute[
### University of New Haven
]
.date[
### 2023-08-28 (updated: 2024-08-29)
]

---
background-image: url(images/ba1.jpg)
background-position: 20% 10%
class: bottom, right

.pull-right[  ]

---

## Outline

* About me
* About the course
* About Business Analytics
* About data
* About R
* Future trends in Business Analytics

---

## Welcome to Business Analytics!

**Instructor**

- Mehmet, [mbalcilar@newhaven.edu](mailto:mbalcilar@newhaven.edu)
- Web: [**here**](https://www.newhaven.edu/faculty-staff-profiles/mehmet-balcilar.php), [**here**](https://www.mbalcilar.net), & [**here**](https://scholar.google.com/citations?user=iNbxXPkAAAAJ&hl=en)
- Professor in Business Analytics, PhD in Econometrics, passionate about harnessing the power of data analytics. Ranked among the **top 2%** most influential scientists worldwide since 2019 in [ranking studies by Stanford University & Elsevier](https://elsevier.digitalcommonsdata.com/datasets/btchxktzyw/6).

---

## Contacting me

Email: [mbalcilar@newhaven.edu](mailto:mbalcilar@newhaven.edu)

- Use Canvas for this class
- Walk in during office hours

I am also happy to arrange meetings by Zoom if you would like a more personal interaction. We can and will find a time that works for both of us!

---

## Course material

The entire course is accessible through [Canvas](canvas.newhaven.edu)!

- You will find lectures here
- You will submit assignments here
- You will find links to all documents here

---

## Software needed for class

- [Microsoft Excel](https://www.microsoft.com/en-us/microsoft-365/excel) (UNH provides a licensed copy for you)
- [R & RStudio](https://r-project.org/)

---

## How will this course help you?

- Learn the principles of analytics
- Learn to communicate with Data Analysts
- Learn when Data Analytics should be used
- Understand when Analytics is not being correctly applied

---

## How can you be successful in this class?
- Do the required readings, and then be active in discussions
- Be ready to “get your hands dirty” in lab
- Collaborate with classmates (it's required!)
- Always ask questions!
  - I only know what is interesting to ME
  - Tell me what is interesting to YOU, and I can talk about it

---

## Let's Get Started!

---

## Play a Game: [Rock, Paper, Scissors](https://www.youtube.com/watch?v=qRn282Wbk0s)

Start by playing a simple game (I recommend about 20 rounds) of [Rock, Paper, Scissors](https://www.afiniti.com/corporate/rock-paper-scissors)

[https://www.afiniti.com/corporate/rock-paper-scissors](https://www.afiniti.com/corporate/rock-paper-scissors)

[https://www.youtube.com/watch?v=qRn282Wbk0s](https://www.youtube.com/watch?v=qRn282Wbk0s)

---

## Where do we actually use models like this?

[The Economist](http://www.economist.com/news/finance-and-economics/21705329-governments-have-much-gain-applying-algorithms-public-policy) gives us a few examples:

- Predict Repeat Offenders
- Predict Heart Attacks in Hospital Patients
- Predict at-risk Students, Police Officers
- Find children at risk of lead poisoning

---

## Have a look at this data:

---
class: center middle

## Tell me: what numbers did you see?

---

## Have a look at this data:

---
class: center middle

## Tell me: what numbers did you see?

---

## Have a look at this data:

---
class: center middle

## Tell me: what numbers did you see?

---

## Why are these models so valuable to us?

Computers have the ability to "see" lots of data points at once, so if we use algorithms carefully, we can utilize vast amounts of data to inform our decisions.

---

## What is data?

**Data** is a representation of objects or information from the real world

---

## What is data?

**Data** is a representation of objects or information from the real world

- Census responses are data

---

---

## What is data?
**Data** is a representation of objects or information from the real world

- Census responses are data
- Blueprints are data

---

---

## What is data?

**Data** is a representation of objects or information from the real world

- Census responses are data
- Blueprints are data
- Images are data

---

---

## What is data?

**Data** is a representation of objects or information from the real world

- Census responses are data
- Blueprints are data
- Images are data
- Financial records are data

---

---

## What is data?

**Data** is a representation of objects or information from the real world

- Census responses are data
- Blueprints are data
- Images are data
- Financial records are data
- CAD designs are data

---

---

## Big Data

.pull-left[
**Big data:** Any set of data that is too large or too complex to be handled by standard data-processing techniques and typical desktop software.

IBM describes the phenomenon of big data through the four Vs:

- Volume
- Velocity
- Variety
- Veracity
]

.pull-right[  ]

---

## What is Business Analytics?

- **Business analytics** is the scientific process of learning from data, transforming data into insight for making better decisions
- *Broader* than **business intelligence**, which focuses on describing and predicting performance.
- *Broader* than **econometrics/statistics**, which typically starts from theory (hypotheses or models) and then assesses whether the data support or refute it
- *Narrower* than **data science**, as we focus primarily on *business problems*

---

## Getting The Right Information

- Only 36% of CIOs believe management is using the right information to run the business. (*Gartner Research 2020*)
- Fewer than one in **ten** corporate executives believe they have the **necessary** information when they need it to make **critical** business decisions.
- More than **half** of these senior executives are concerned that, as a result of *missing* information, they may be making poor decisions, and a quarter believe that management frequently or always gets its decisions wrong. (*Economist Intelligence Unit 2020*)

---

## So what?

### The biggest question in business analytics is ‘Why?’

---

## So what?

- Why is this happening? If you notice that there’s a pattern, ask, **“Why?”**
- Is there something wrong with the data or is this an actual pattern going on?
- Can we conclude anything from this pattern? A natural curiosity will definitely give you a good foundation. -- *Carla Gentry, Data Scientist at Talent Analytics*
- [**Business analysts** are] able to think of ways to use data to solve problems that otherwise would have been unsolved, or solved using only intuition. -- *Peter Skomoroch, Former Principal Data Scientist at LinkedIn*

---

## Improving Productivity and Collaboration at Microsoft

- Moving the 1,200-person group from five buildings to four could improve collaboration
- The move resulted in a 46 percent decrease in meeting travel time
- Estimated savings of $520,000 per year in employee time

---

## Enhancing Customer Support at Uber

- Uber developed a Customer Obsession Ticket Assistant (COTA) in early 2018
- COTA is a tool that uses machine learning and natural language processing to help agents improve their speed and accuracy when responding to support tickets.
- Result: a nearly seven percent relative reduction in average handle time per ticket

---

## Business Analytics Jobs

---

## Data Science Jobs

---

## Data Jobs!
🤔

---

## FACT Framework

--

### .red[F]rame the question

--

### .red[A]ssemble the data

--

### .red[C]alculate the results

--

### .red[T]ell others the results

---

## A Categorization of Analytical Methods and Models

.pull-left[
**Descriptive analytics:** Encompasses the set of techniques that describes what has happened in the past; examples include:

- Data queries
- Reports
- Descriptive statistics
- Data visualization (including data dashboards)
- Data-mining techniques
- Basic what-if spreadsheet models
]

.pull-right[
**Predictive analytics:** Consists of techniques that use models constructed from past data to predict the future or ascertain the impact of one variable on another.

**Prescriptive analytics:** Indicates a best course of action to take:

- Predictive models provide a forecast or prediction, but do not provide a decision.
- A forecast or prediction, when combined with a rule, becomes a prescriptive model.
- Prescriptive models that rely on a rule or set of rules are often referred to as rule-based models.
]

---

## Business Analytics in Practice

---

## Why now? Why not when computers were invented?

- More Data
  - There has been a tremendous increase in the amount of data available to companies
- More Computer Power
  - In 2005, my laptop had ¾GB of RAM, and cost over $2000
  - As of 2014, my laptop (still at just over $2000) had 16GB of RAM, an increase of more than 2000%! Processor performance increased similarly.
  - Today, a similar laptop can be purchased for closer to $1000

---

## Why now?

- Synergy between Data and Computation Power
  - We now have enough data to draw meaningful conclusions
  - We have enough computational power!
    - [OmniSci](https://www.omnisci.com/demos/ships/), [more OmniSci](https://www.omnisci.com/blog/visualizing-1.7-billion-stars-in-the-galaxy-at-the-speed-of-light)
    - NVIDIA GTX 1080 Ti: ~11 TFLOPS (about 11 trillion floating-point operations per second)
    - Only costs $599

---

## So where can we get the data we need?
Anywhere that records are kept, if the owner is willing to share.

- Facebook, Twitter, and LinkedIn all collect very robust information on users and their interactions
- We could also record our own data if needed: surveys, usage statistics, credit card transactions, etc.

---

## So where can we get the data we need?

Some of my all-time favorites:

- [IPUMS.org](https://www.ipums.org/)
- [Data.gov](https://www.data.gov/)
- [ICPSR@UMich](https://www.icpsr.umich.edu/icpsrweb/)

---

## So the data will magically analyze itself, right?!

- Just like a book that is never read, data that is only recorded and never used is not that useful
- We need to find a way to UTILIZE data
- That is what we will learn about during the rest of this class!

---

## We need to ask questions!

---

## Exploring data helps us to find the questions we care about

Let's use `R` to explore our data

---

## Process of reporting?

- Do analysis in one tool (Excel/R/Python/SAS/SPSS/Minitab/Stata/Matlab)

--

- Make tables and charts in another tool (Excel/Tableau/PowerBI)

--

- Screenshot or paste everything into a text editor (Word/LaTeX)

--

- Turn it into a final product (PDF/PowerPoint)

--

.center[  ]

---

## We can do it all with R!

<img src="images/environmental-data-science-r4ds-general.png" width="95%" style="float:center" />

---

## What is R?

- R is an **open source software** suite including
  - an effective **data handling** and storage facility,
  - a suite of operators for **calculations** on arrays, in particular matrices,
  - an integrated collection of intermediate tools for **data analysis and graphics**, and
  - a **programming language** which includes conditionals, loops, user-defined recursive functions and input and output facilities.

--

## R works best when combined with RStudio

RStudio is an **open source** integrated development environment (IDE) for R.
It includes a console, a syntax-highlighting editor that supports direct code execution, and tools for plotting, history, debugging, and workspace management.

---

## [*Action Item*] Download and Install R & RStudio

### See the instructions on the Canvas page

### OR

### See [my tutorial page](https://banl6100.netlify.app/tutorials/startingr)

---

## Industry and R

.pull-left[
- R has shown impressive growth. [See this news](https://stackoverflow.blog/2017/10/10/impressive-growth-r/)
- Industry adoption of R beats that of any other language
]

.pull-right[  ]

---

## R Jobs: US

---

## R Jobs: Germany

---

## R packages

- The open source nature of R allows users to create software extensions called *packages*.
- Packages are the fundamental units of reproducible R code and contain groups of useful functions.
- [CRAN is a great place to find packages.](https://cran.r-project.org/web/packages/available_packages_by_name.html)
- [You can make your own packages to support your work!](https://www.youtube.com/watch?v=J45Hs9XyN68)

.center[  ]

---

## [*Action Item*] Installing Packages

### Install some packages

- Open RStudio
- Run the following command in your R console window

``` r
# Install the listed packages from CRAN
install.packages(c('dplyr', 'tidyr', 'rmarkdown', 'bslib', 'ggplot2', 'fredr', 'highcharter', 'plotly', 'DT'))
```

---

## R users are an amazingly supportive community

.pull-left[
<img src="images/code_hero_rstats.png" width="150%" style="float:center" />
]

.pull-right[
- [R Ladies](https://rladies.org/) - worldwide organization to promote gender diversity
- R User Groups - in many major cities across the world
- R conferences - [rstudio::conf](https://www.rstudio.com/conference/), [useR!](https://user2022.r-project.org/), many local options
- [R Twitter (#rstats)](https://twitter.com/hashtag/rstats)
- [Hadley Wickham](https://twitter.com/hadleywickham)
- [Jenny Bryan](https://twitter.com/jennybryan)
- [Dirk Eddelbuettel](https://twitter.com/eddelbuettel)
- [Alison Presmanes Hill](https://twitter.com/apreshill/)
- [Tom Mock](https://twitter.com/thomas_mock)
- [Julia Silge](https://twitter.com/juliasilge)
]

---
class: inverse center middle

# Making Dazzling Reports

---

---

## R Project

"R experts keep all the files associated with a project together — input data, R scripts, analytical results, figures. This is such a wise and common practice that RStudio has built-in support for this via projects." -- Hadley Wickham ([R for Data Science](https://r4ds.had.co.nz/workflow-projects.html))

--

R projects define your working directories:

- Paths are relative rather than fixed
- Your work is portable and tidy

---

## [*Action Item*] Open/Make an R project

### [Create a Project (Video)](https://youtu.be/CojhAWZ92FE)

1. Open RStudio
2. Configure global options to avoid saving .RData (one-time step)
   - Tools -> Global Options -> Workspace -> Uncheck "Restore .RData into workspace at startup". Change the "Save workspace to .RData on exit" dropdown to Never.
   - Restart RStudio
3. Create an R project to associate with a set of files

### [Open the Project for dinosaurs dataset](https://github.com/mine-cetinkaya-rundel/dinosaurs) [(Video)](videos/open_rproject.mp4)

1. [Download files from GitHub](https://github.com/mine-cetinkaya-rundel/dinosaurs) [(See Video)](videos/open_rproject.mp4)
2. Click on the R project file

---

## Folder Infrastructure

.pull-left[
- Store code (R scripts) separate from your other reporting (e.g.
R Markdown) files
- Consider pre-processing and storage of summary data
- Don't try to fit too much into one report; separate reports can sit in one folder
]

.pull-right[  ]

---

.center[]

---
class: segue-yellow, center, middle

# Future Trends

---

## Implementing AI In Data Analytics

**Search volume for “AI analytics” is growing, already up 355% in the past 5 years.**

---

## Implementing AI In Data Analytics

**Traditional BI heavily relies on IT, but ML-powered analytics puts the data in the hands of business users.**

---

## Big Data Automation

---

## Big Data Automation

> Everything that mankind does on earth will be converted into data: IDC predicts that by 2025 the total amount of digital data created worldwide will rise to 175 zettabytes (from approximately 40 zettabytes in 2019).

---

## Decision Intelligence

---

## Edge Data and Analytics

> Handling the data at the device that generates it.

---

## Data Visualization / Storytelling

---

## Democratization of Data Systems

---
class: segue-yellow, center, middle

# End of lecture ...

# Next topic: Advanced Excel Functions and Business Analysis

# [Mehmet Balcilar](https://www.mbalcilar.net)

# [mbalcilar@newhaven.edu](mailto:mbalcilar@newhaven.edu)