MSBA IS 6813 | Spring 2026
| Core Tools | Specs | Data Room | Shared Google Drive | Group Dashboard | GitHub Repo |
|---|---|---|---|---|---|
| Assignments | 01 Business Problem Statement | 02 EDA | 03 Modeling | 04 Presentation | |
| Phase | Milestone | Hard Deadline |
|---|---|---|
| 🟢 | Business Problem Statement | Jan 28 |
| 🟡 | EDA Group Notebook | Feb 18 |
| ⚪ | Modeling Notebook | Mar 18 |
| ⚪ | Practice Presentation | Apr 05 |
| ⚪ | Final Sponsor Delivery | Apr 08/15 |
| ⚪ | Portfolio & Peer Eval | Apr 19 |
Primary Directive: Copy this block exactly into the top of every .qmd file.
```yaml
---
title:
subtitle:
date: "Spring 2026"
format:
  html:
    theme: journal
    toc: true
    toc-depth: 3
    toc-float: true
    number-sections: false
    code-fold: true
    code-tools: true
    df-print: paged
    highlight-style: github
  pdf:
    documentclass: article
    geometry:
      - margin=1in
    toc: true
    number-sections: false
    colorlinks: true
    mainfont: "Arial"
    sansfont: "Arial"
    monofont: "Courier New"
editor: visual
---
```
Rule: Use these blocks to initialize your environment. They include Dynamic Core Selection to maximize performance on any machine without crashing it (N-1 logic).
For R:

```r
# Load Core Packages
if (!require("pacman")) install.packages("pacman")
pacman::p_load(tidyverse, here, parallel, doParallel)

# Dynamic Parallel Processing (detects your hardware)
# Leaves 1 core free for the OS to prevent freezing
# max(..., 1) guards against single-core machines
num_cores <- parallel::detectCores(logical = FALSE)
n_workers <- max(num_cores - 1, 1)
cl <- makeCluster(n_workers)
registerDoParallel(cl)
print(paste("Cluster active with", n_workers, "cores."))

# When you are done with parallel work, release the cluster:
# stopCluster(cl)
```
For Python:

```python
import multiprocessing

import numpy as np
import pandas as pd
from pyprojroot import here

# Dynamic Core Selector
# Pass this to 'n_jobs' in scikit-learn models (e.g., n_jobs=n_jobs)
# max(..., 1) guards against single-core machines
n_jobs = max(multiprocessing.cpu_count() - 1, 1)
print(f"Parallel processing enabled: {n_jobs} cores available.")
```
Standard R and Python scripts run linearly on a single CPU core, leaving most of your computer's power idle (often 80-90% on a modern multi-core machine). Enabling parallel processing, as shown above, distributes computations across multiple cores simultaneously.
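To make the idea concrete, here is a minimal standard-library sketch of the same N-1 pattern. `slow_square` is a made-up stand-in for any expensive per-item computation; nothing here is specific to our project code.

```python
import multiprocessing

def slow_square(x):
    # Stand-in for an expensive per-item computation
    return x * x

if __name__ == "__main__":
    n_jobs = max(multiprocessing.cpu_count() - 1, 1)  # N-1 logic
    with multiprocessing.Pool(processes=n_jobs) as pool:
        # The input range is split across worker processes automatically
        results = pool.map(slow_square, range(10))
    print(results)  # → [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

Scikit-learn's `n_jobs` parameter does the equivalent splitting internally, which is why computing it once at the top of a notebook is enough.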
Avoid absolute paths (e.g., C:/Users/Thomas/...); they only work on one person's machine. Use project-relative paths instead:
For R (using here):

```r
library(here)

# Automatically finds the project root (where .git is)
df <- read.csv(here::here("data", "application_train.csv"))
```
For Python (using pyprojroot):

```python
import pandas as pd
from pyprojroot import here

# Automatically finds the project root
path = here("data/application_train.csv")
df = pd.read_csv(path)
```
Visual map of how files, data, and code interact within this repository.
```
┌──────────────────┐       ┌────────────────────┐       ┌──────────────────┐
│    📁 data/      │       │   📁 notebooks/    │       │   📁 output/     │
│   (Local Only)   │──────▶│  (Code Execution)  │──────▶│  (Deliverables)  │
│  Raw .csv Files  │       │   .qmd Analysis    │       │   .csv / .png    │
└──────────────────┘       └────────────────────┘       └──────────────────┘
         │                           ▲
         │                           │
         └───── (Load via 'here') ───┘
```
```
├── data/                   # RAW data (Local only - Git ignored)
├── notebooks/
│   ├── 01_Business_Problem/
│   ├── 02_EDA/
│   ├── 03_Modeling/
│   ├── 04_Presentation/
│   └── individual/         # Individual "Sandboxes" for portfolio
├── output/                 # Exported .csv results and .png plots
├── docs/                   # Meeting notes and sponsor requirements
└── README.md               # This Hub
```
| Team Member | Email (Personal) | Email (University) | Phone |
|---|---|---|---|
| Thomas Beck | thomasscottbeck@gmail.com | u0399590@utah.edu | +1 (801) 631-2080 |
| Max Ridgeway | [TBD] | u1230181@utah.edu | +1 (801) 597-3824 |
| Astha KC | asthakc.us@gmail.com | u1561947@utah.edu | +1 (971) 500-6757 |
Note: Before starting any work session, run `git pull` to sync the latest model changes from the team.
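To see why this habit matters, here is a self-contained sketch run entirely in throwaway temp directories (the `upstream`/`team` repo names are made up for the demo). In the real project you would simply run `git pull` from the repo root.

```shell
set -e
tmp=$(mktemp -d)

# A teammate's repo with one commit
git init -q "$tmp/upstream"
git -C "$tmp/upstream" config user.email demo@example.com
git -C "$tmp/upstream" config user.name Demo
echo "v1" > "$tmp/upstream/notes.md"
git -C "$tmp/upstream" add notes.md
git -C "$tmp/upstream" commit -qm "v1"

# Your local clone, made before the teammate commits more work
git clone -q "$tmp/upstream" "$tmp/team"
echo "v2" > "$tmp/upstream/notes.md"
git -C "$tmp/upstream" commit -qam "v2"

# Start-of-session sync: pull brings in the new commit
git -C "$tmp/team" pull -q
cat "$tmp/team/notes.md"   # prints: v2
```

Skipping the pull and committing on top of stale files is what produces merge conflicts in shared notebooks.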