Handbook of Graphs and Networks in People Analytics

Author: Keith McNulty
Language: English
Pages: 268

Handbook of Graphs and Networks in People Analytics: With Examples in R and Python

Introduction

People analytics has rapidly evolved into a vital discipline for understanding human behavior, organizational performance, and workforce optimization. As businesses use data to inform decisions about hiring, collaboration, retention, and productivity, they need analytical tools that are both intuitive and powerful. Graph theory and network analysis are among the most useful tools in this domain.

Graphs and networks aren’t just abstract mathematical concepts. They provide a structured way to model relationships — whether connections between employees, communication patterns, skills networks, or influence flows in an organization. This handbook is tailored for beginner engineering students and professionals eager to understand how graphs and networks serve as fundamental pillars in people analytics.

Throughout this article, you’ll learn both the conceptual groundwork and practical applications of these tools. From theory to real-world implementation, we will break down complex ideas into digestible sections.


Background Theory

Relationships lie at the heart of people analytics. Traditional tabular data captures individual attributes (such as age, salary, or performance scores) but does not fully capture how individuals connect to one another. This is where graph theory comes into play.

What are Graphs and Networks?

At their core:

  • Graph: A mathematical structure used to model pairwise relations between objects. It consists of nodes (vertices) and edges (links).

  • Network: A real-world interpretation of a graph, where nodes represent entities (persons, roles, teams), and edges represent connections (communication, collaboration, reporting lines).

Graphs can be:

  • Directed or Undirected: Directed graphs have one-way connections (A → B), whereas undirected graphs have symmetric links with no inherent direction (A ↔ B).

  • Weighted or Unweighted: Weights represent intensity, frequency, or value of relationships.

Networks are everywhere — from social interactions to transportation maps and communication channels.
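
The distinctions above can be sketched in a few lines of Python. This is a minimal illustration using NetworkX, which the article uses later; the names and the weight of 3 are made up for the example.

```python
import networkx as nx

# Undirected: one symmetric tie, e.g. "worked together".
U = nx.Graph()
U.add_edge("Alice", "Bob")

# Directed: a one-way tie, e.g. "Alice reports to Bob".
D = nx.DiGraph()
D.add_edge("Alice", "Bob")

# Weighted: the "weight" attribute stores intensity (a made-up count here).
W = nx.Graph()
W.add_edge("Alice", "Bob", weight=3)

print(U.has_edge("Bob", "Alice"))   # True: direction is ignored
print(D.has_edge("Bob", "Alice"))   # False: only Alice -> Bob exists
print(W["Alice"]["Bob"]["weight"])  # 3
```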


Technical Definition

Formally, a graph G is defined as:

G = (V, E)
where V is a set of vertices and E is a set of edges connecting pairs of vertices.

Important foundational definitions:

  • Vertex (Node): An entity or object in a graph.

  • Edge (Link): A connection or relationship between two vertices.

  • Adjacency: When two nodes share an edge.

  • Degree: The number of connections a node has.

  • Path: A sequence of vertices in which each consecutive pair is joined by an edge.

  • Cycle: A path that starts and ends at the same node.
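
To make these definitions concrete, here is a small sketch on a hypothetical four-person network. The names are invented, and NetworkX is assumed as the library; each call is annotated with the definition it corresponds to.

```python
import networkx as nx

# Hypothetical four-person network: a triangle plus one extra contact.
G = nx.Graph()
G.add_edges_from([("Ana", "Ben"), ("Ben", "Cy"), ("Cy", "Ana"), ("Cy", "Dee")])

vertices = sorted(G.nodes)                  # the vertex set V
edge_count = G.number_of_edges()            # the size of the edge set E
adjacent = G.has_edge("Ana", "Ben")         # adjacency: do two nodes share an edge?
degree_cy = G.degree("Cy")                  # degree: number of connections
path = nx.shortest_path(G, "Ana", "Dee")    # one path between two nodes
cycles = nx.cycle_basis(G)                  # cycles present in the graph

print(vertices, edge_count, adjacent, degree_cy)
print(path)    # ['Ana', 'Cy', 'Dee']
print(cycles)  # one cycle containing Ana, Ben and Cy
```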

In people analytics:

  • Nodes = individuals, roles, teams

  • Edges = connection types like communication or collaboration

There are also advanced concepts such as centrality, clustering coefficients, and communities that help measure the importance and structure of networks.


Step-by-Step Explanation

Understanding how to work with graphs in people analytics involves several steps:

1. Data Acquisition

Start by collecting relevant data. In people analytics, this may include:

  • Email logs

  • Project collaboration records

  • Organizational charts

  • Social media interactions

  • Surveys or HR databases

2. Clean and Prepare Data

Raw data needs transformation:

  • Remove duplicates

  • Normalize formats

  • Map identifiers to unique individuals

  • Ensure consistency
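
The cleaning steps above might look like the following in plain Python. This is only a sketch: the record layout (message ID, sender, receiver), the alias table, and the decision to drop self-addressed messages are all assumptions made for illustration.

```python
# Hypothetical raw records: (message_id, sender, receiver).
raw = [
    ("m1", "Alice@Example.com ", "bob@example.com"),
    ("m1", "alice@example.com", "bob@example.com"),      # same message logged twice
    ("m2", "a.smith@example.com", "carol@example.com"),  # alias of alice
    ("m3", "bob@example.com", "bob@example.com"),        # self-addressed message
]

# Assumed alias table mapping secondary addresses to one canonical ID.
ALIASES = {"a.smith@example.com": "alice@example.com"}


def canonical(address: str) -> str:
    """Normalize case and whitespace, then resolve known aliases."""
    cleaned = address.strip().lower()
    return ALIASES.get(cleaned, cleaned)


seen_ids = set()
clean = []
for msg_id, sender, receiver in raw:
    sender, receiver = canonical(sender), canonical(receiver)
    # Skip duplicate records and (by assumption) self-addressed messages.
    if msg_id in seen_ids or sender == receiver:
        continue
    seen_ids.add(msg_id)
    clean.append((msg_id, sender, receiver))

print(clean)
```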

3. Convert to Graph Format

You must convert relational data to graph form:

  • Nodes: Unique individuals

  • Edges: Relationships or interactions among individuals

For example, if Alice communicated with Bob 10 times last week, a weighted edge could be created:
Alice —10→ Bob
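
One way to derive such weights is to count rows in an interaction log. The sketch below uses only the standard library; the log contents are invented, with one (sender, receiver) row per message.

```python
from collections import Counter

# Hypothetical interaction log: one (sender, receiver) row per message.
log = [("Alice", "Bob")] * 10 + [("Bob", "Alice")] * 4 + [("Alice", "Carol")] * 2

edge_weights = Counter(log)  # (sender, receiver) -> message count
nodes = sorted({person for pair in log for person in pair})

for (sender, receiver), count in edge_weights.items():
    print(f"{sender} --{count}--> {receiver}")
```

Counting ordered pairs keeps direction, so Alice to Bob and Bob to Alice remain separate edges. For an undirected view, the two counts could be summed instead.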

4. Build the Graph

Use a graph library or graph database (NetworkX, Neo4j, GraphFrames) to construct the graph. For example, in Python with NetworkX:

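import networkx as nx

# Directed graph, so the edge runs from Alice to Bob as in the example above
G = nx.DiGraph()
# weight = number of messages
G.add_edge('Alice', 'Bob', weight=10)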
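
In practice a graph is usually loaded from a whole edge list rather than one edge at a time. A minimal sketch, assuming hypothetical (sender, receiver, message_count) triples and a directed graph:

```python
import networkx as nx

# Hypothetical (sender, receiver, message_count) triples.
edges = [("Alice", "Bob", 10), ("Bob", "Alice", 4), ("Alice", "Carol", 2)]

G = nx.DiGraph()
G.add_weighted_edges_from(edges)  # stores each count under the "weight" attribute
G.add_node("Dave")                # someone with no recorded interactions

print(G.number_of_nodes(), G.number_of_edges())  # 4 3
print(G["Alice"]["Bob"]["weight"])               # 10
```

Adding people without interactions explicitly matters later: they would otherwise be missing from the graph and could not be detected as isolated.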

5. Analyze the Network

Apply metrics:

  • Degree Centrality: Who has the most direct connections?

  • Betweenness Centrality: Who bridges different groups?

  • Clustering Coefficient: How tightly are a person's contacts connected to one another?
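
The three metrics above can be computed with NetworkX as follows. The network is a made-up example of two three-person teams joined by a single tie between Cara and Dev.

```python
import networkx as nx

# Hypothetical network: two triangles joined by the Cara-Dev tie.
G = nx.Graph()
G.add_edges_from([
    ("Ana", "Ben"), ("Ana", "Cara"), ("Ben", "Cara"),
    ("Cara", "Dev"),
    ("Dev", "Eli"), ("Dev", "Fay"), ("Eli", "Fay"),
])

degree = nx.degree_centrality(G)            # share of others each person is tied to
betweenness = nx.betweenness_centrality(G)  # share of shortest paths passing through
clustering = nx.clustering(G)               # how connected a person's contacts are

for person in sorted(G.nodes):
    print(f"{person:5s} degree={degree[person]:.2f} "
          f"betweenness={betweenness[person]:.2f} "
          f"clustering={clustering[person]:.2f}")
```

In this toy network Cara and Dev score highest on betweenness because every path between the two teams runs through them, while Ana's clustering is 1.0 because her two contacts also know each other.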

6. Visualize the Graph

Visualization helps reveal hidden patterns using tools like:

  • Gephi

  • Cytoscape

  • D3.js

  • Python libraries (Matplotlib, Plotly)
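
As one possible hand-off to a visualization tool, a NetworkX graph can be exported to GEXF, a format that Gephi opens. The sketch below is illustrative: the graph, the "dept" attribute, and the output file name are assumptions.

```python
import os
import tempfile

import networkx as nx

# Hypothetical small network with a department attribute for coloring nodes.
G = nx.Graph()
G.add_edge("Alice", "Bob", weight=10)
G.add_edge("Bob", "Carol", weight=2)
nx.set_node_attributes(
    G, {"Alice": "Eng", "Bob": "Eng", "Carol": "Design"}, name="dept"
)

# Write to a temporary location; any writable path would do.
out_path = os.path.join(tempfile.gettempdir(), "people_network.gexf")
nx.write_gexf(G, out_path)
print("Wrote", out_path)
```

Node and edge attributes travel with the file, so the department and weight values should be available for styling once the file is opened.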

7. Interpret Results

Translate metrics and visuals into actionable insights:

  • Identify influencers

  • Detect isolated individuals

  • Map collaboration bottlenecks
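
Two of these questions translate fairly directly into graph queries. The sketch below treats people with no edges as isolated and uses articulation points (nodes whose removal disconnects part of the network) as one possible proxy for bottlenecks. Both the data and that choice of proxy are assumptions.

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("Ana", "Ben"), ("Ana", "Cara"), ("Ben", "Cara"),  # team 1
    ("Cara", "Dev"),                                   # only link between teams
    ("Dev", "Eli"), ("Dev", "Fay"), ("Eli", "Fay"),    # team 2
])
G.add_node("Gus")  # hypothetical employee with no recorded interactions

isolated = list(nx.isolates(G))
bottlenecks = sorted(nx.articulation_points(G))

print("Isolated:", isolated)        # ['Gus']
print("Bottlenecks:", bottlenecks)  # ['Cara', 'Dev']
```

A flagged person is a prompt for a conversation, not a conclusion: an isolate may simply collaborate through channels that were not logged.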


Detailed Examples

To illustrate, let’s look at two example scenarios:

Example 1: Communication Network Analysis

Scenario: A company wants to understand email communication patterns.

Steps:

  1. Extract email logs (sender, receiver, timestamp).

  2. Build an undirected graph where nodes are employees and an edge links any two employees who exchanged email.

  3. Assign weights based on message count.

Analysis Objectives:

  • Identify central communicators (high degree centrality)

  • Detect employees with no connections (isolates) and departments that are cut off from the rest of the network (disconnected components)

Outcome: HR can create opportunities to connect distant teams and promote knowledge sharing.
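
A compact sketch of this scenario follows. The email rows, the staff list, and the ISO-style timestamps are invented; the code builds the undirected weighted graph and then reads off the most central communicator and any isolates.

```python
from collections import Counter

import networkx as nx

# Hypothetical email log rows: (sender, receiver, timestamp).
emails = [
    ("ana", "ben", "2024-03-04T09:00"),
    ("ben", "ana", "2024-03-04T09:30"),
    ("ana", "cara", "2024-03-05T11:00"),
    ("cara", "ana", "2024-03-05T11:20"),
    ("ana", "ben", "2024-03-06T14:00"),
]
staff = ["ana", "ben", "cara", "dev"]  # dev is on the payroll but sent no email

# Undirected view: count messages per unordered pair, ignoring self-sends.
counts = Counter(frozenset((s, r)) for s, r, _ in emails if s != r)

G = nx.Graph()
G.add_nodes_from(staff)
for pair, n in counts.items():
    a, b = sorted(pair)
    G.add_edge(a, b, weight=n)

centrality = nx.degree_centrality(G)
top = max(centrality, key=centrality.get)
isolates = list(nx.isolates(G))

print(top, isolates)  # ana ['dev']
```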


Example 2: Team Collaboration Graph

Scenario: A software company wants to see how well engineers collaborate on projects.

Steps:

  1. Collect project assignment data.

  2. Map individuals to nodes.

  3. If two people worked together on a project, connect them with an edge.

  4. Weight the edge by number of shared projects.

Analysis Objectives:

  • Find strong collaboration pairs

  • Discover underutilized talent

  • Assess team cohesion

Outcome: Managers can use these insights to improve team formation and reduce burnout.
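
The steps in this example can be sketched as follows, assuming project assignments arrive as a mapping from project to members (the project and engineer names are invented). Density is used here as one rough indicator of cohesion; it is not the only option.

```python
from itertools import combinations

import networkx as nx

# Hypothetical project assignments: project -> engineers who worked on it.
projects = {
    "billing": ["ana", "ben", "cara"],
    "search": ["ana", "ben"],
    "mobile": ["cara", "dev"],
}

G = nx.Graph()
for members in projects.values():
    G.add_nodes_from(members)
    # Connect every pair on the project; bump the weight for repeat pairs.
    for a, b in combinations(sorted(members), 2):
        if G.has_edge(a, b):
            G[a][b]["weight"] += 1
        else:
            G.add_edge(a, b, weight=1)

strongest = max(G.edges(data="weight"), key=lambda e: e[2])
density = nx.density(G)  # share of possible pairs that have collaborated

print(strongest, round(density, 2))
```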


Real-World Application in Modern Projects

1. Organizational Network Analysis (ONA)

Companies like Google and Deloitte use ONA to map informal networks alongside formal reporting structures. ONA reveals hidden influencers, cross-department collaboration patterns, and information bottlenecks.

2. Talent Mobility and Skill Networks

By modeling employees’ skills and job pathways, organizations can identify skill gaps, plan training programs, and determine optimal career paths.
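
A skill network can be modeled as a two-mode graph linking employees to skills. The sketch below is a simplified illustration: the employees, skills, and target role requirements are hypothetical, and a gap is defined simply as a required skill with no link.

```python
import networkx as nx

# Hypothetical employee -> skills data and a hypothetical target role.
skills = {
    "ana": {"python", "sql"},
    "ben": {"sql", "tableau"},
}
role_requirements = {"python", "sql", "statistics"}

B = nx.Graph()
B.add_nodes_from(skills, kind="employee")
for person, owned in skills.items():
    for skill in owned:
        B.add_node(skill, kind="skill")
        B.add_edge(person, skill)


def skill_gap(person: str) -> set:
    """Required skills the person is not yet linked to."""
    return role_requirements - set(B.neighbors(person))


# Skills that nobody in the organization covers.
covered = {n for n, d in B.nodes(data=True) if d["kind"] == "skill"}
org_gap = role_requirements - covered

print(skill_gap("ana"), skill_gap("ben"), org_gap)
```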

3. Remote Work and Collaboration Patterns

With the rise of virtual teams, network analysis helps understand digital communication flows, ensuring remote workers are integrated and supported.

4. Diversity & Inclusion Initiatives

Network analysis can uncover clustering by demographic attributes, informing strategies to improve cross-group collaborations and inclusion efforts.
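
One simple way to quantify such clustering is the share of ties that cross group boundaries. In the sketch below, "A" and "B" are placeholder labels standing in for any attribute of interest, and the network is invented.

```python
import networkx as nx

# Placeholder group labels; a1..a3 and b1..b3 are hypothetical people.
group = {"a1": "A", "a2": "A", "a3": "A", "b1": "B", "b2": "B", "b3": "B"}

G = nx.Graph()
G.add_nodes_from(group)
nx.set_node_attributes(G, group, name="group")
G.add_edges_from([
    ("a1", "a2"), ("a1", "a3"), ("a2", "a3"),  # ties within group A
    ("b1", "b2"), ("b1", "b3"), ("b2", "b3"),  # ties within group B
    ("a1", "b1"),                              # the only cross-group tie
])

cross = [
    (u, v) for u, v in G.edges
    if G.nodes[u]["group"] != G.nodes[v]["group"]
]
cross_share = len(cross) / G.number_of_edges()

print(f"{cross_share:.2f} of ties cross groups")  # 0.14
```

A low share on its own does not establish a cause; group sizes and team assignments also shape how many cross-group ties are possible.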

5. Customer and Employee Advocacy Networks

Graphs also help identify employee ambassadors and detractors by analyzing how individuals influence others on internal and external platforms.


Common Mistakes

Even with powerful tools, several common errors can undermine your graph analysis:

1. Poor Data Quality

Garbage in, garbage out. If your data contains errors, your network model will be flawed.

2. Ignoring Edge Weights

Treating all interactions equally misses depth — e.g., occasional vs. daily communication.
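
The difference is easy to see by comparing a plain contact count with a weighted one. In this made-up example, ana has three occasional contacts while eli has a single daily one.

```python
import networkx as nx

G = nx.Graph()
G.add_weighted_edges_from([
    ("ana", "ben", 1), ("ana", "cara", 1), ("ana", "dev", 1),  # occasional
    ("eli", "fay", 40),                                        # daily
])

degree = dict(G.degree())                   # number of contacts
strength = dict(G.degree(weight="weight"))  # total interaction volume

print(degree["ana"], strength["ana"])  # 3 3
print(degree["eli"], strength["eli"])  # 1 40
```

By contact count ana looks more connected, yet eli's single relationship carries far more interaction.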

3. Focusing on Quantity Over Quality

A person with many connections isn’t necessarily influential; context matters.

4. Misinterpreting Centrality

Different centrality metrics answer different questions. Pick the right one for your goal.
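
A small constructed example shows how two metrics can disagree. Here two fully connected teams are linked only through one person, Xena, who has the fewest direct ties but sits on every path between the teams. All names are hypothetical.

```python
from itertools import combinations

import networkx as nx

team_1 = ["Ana", "Ben", "Cara", "Dev"]
team_2 = ["Eli", "Fay", "Gus", "Hana"]

G = nx.Graph()
G.add_edges_from(combinations(team_1, 2))  # everyone in team 1 knows each other
G.add_edges_from(combinations(team_2, 2))  # same for team 2
G.add_edges_from([("Ana", "Xena"), ("Xena", "Eli")])  # Xena links the teams

degree = nx.degree_centrality(G)
betweenness = nx.betweenness_centrality(G)

lowest_degree = min(degree, key=degree.get)
top_betweenness = max(betweenness, key=betweenness.get)

print(lowest_degree, top_betweenness)  # Xena Xena
```

Ranking by degree alone would overlook Xena entirely, which is why the metric should follow from the question being asked.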

5. Overlooking Privacy and Ethics

Network data can reveal sensitive information about relationships. Respect privacy and compliance requirements.


Challenges & Solutions

Challenge 1: Large Scale Networks

As networks grow into the thousands of nodes and beyond, some metrics (betweenness centrality, for example) become computationally expensive.

Solution:
Use distributed graph engines (Apache Spark GraphX), subset sampling, or cloud-based graph databases.
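
Before reaching for distributed infrastructure, sampling is often the cheapest thing to try. NetworkX's betweenness_centrality accepts a k parameter that estimates the metric from k sampled source nodes. The sketch below uses a synthetic random graph as a stand-in for real data, and the sizes are arbitrary.

```python
import networkx as nx

# Synthetic stand-in for a large organizational network.
G = nx.gnm_random_graph(n=300, m=900, seed=42)

# Exact betweenness uses every node as a source; k=50 samples only some,
# trading accuracy for speed. The seed makes the sample reproducible.
approx = nx.betweenness_centrality(G, k=50, seed=42)
top10 = sorted(approx, key=approx.get, reverse=True)[:10]

print(top10)
```

The estimate is approximate, so rankings near the cut-off may shift between samples; comparing a few seeds is a reasonable sanity check.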


Challenge 2: Dynamic Networks

Relationships change over time, making static snapshots incomplete.

Solution:
Implement temporal graphs that capture time-based edge formation, allowing trend detection.
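
A lightweight way to approximate a temporal graph is to keep one snapshot per time window. The sketch below groups invented, dated interactions by ISO year and week; the weekly window is an assumption and should match the question at hand.

```python
from collections import defaultdict
from datetime import date

import networkx as nx

# Hypothetical dated interactions: (person_a, person_b, day).
events = [
    ("ana", "ben", date(2024, 3, 4)),
    ("ana", "ben", date(2024, 3, 6)),
    ("ben", "cara", date(2024, 3, 12)),
    ("ana", "cara", date(2024, 3, 13)),
]

snapshots = defaultdict(nx.Graph)  # (ISO year, ISO week) -> graph for that week
for a, b, day in events:
    window = tuple(day.isocalendar()[:2])
    g = snapshots[window]
    previous = g.get_edge_data(a, b, default={"weight": 0})["weight"]
    g.add_edge(a, b, weight=previous + 1)

for window in sorted(snapshots):
    print(window, snapshots[window].number_of_edges())
```

Comparing metrics across consecutive snapshots then shows how ties form, strengthen, or disappear over time.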


Challenge 3: Data Sensitivity

Handling employee communication data raises ethical concerns.

Solution:
Use anonymization, secure storage, and consent mechanisms. Follow GDPR and other privacy standards.
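
As a first step, identifiers can be replaced with keyed hashes before analysis. The sketch below uses HMAC-SHA256 from the standard library; the key shown is a placeholder and the 12-character truncation is an arbitrary choice. Note that this is pseudonymization: network structure alone can sometimes re-identify people, so it complements access controls and consent rather than replacing them.

```python
import hashlib
import hmac

import networkx as nx

# Placeholder key; in practice it would be stored apart from the dataset.
SECRET_KEY = b"replace-with-a-secret-kept-outside-the-dataset"


def pseudonym(identifier: str) -> str:
    """Keyed hash, so guessed names cannot be matched without the key."""
    digest = hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]


G = nx.Graph()
G.add_edge("alice@example.com", "bob@example.com", weight=10)

mapping = {node: pseudonym(node) for node in G.nodes}
H = nx.relabel_nodes(G, mapping)  # returns a relabeled copy by default

print(list(H.edges(data="weight")))
```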


Case Study

Case Study: Improving Cross-Team Collaboration at TechCo

Background:
TechCo, a mid-sized software company, noticed stagnation in innovation. Teams were operating in silos.

Objective:
Map collaboration patterns to identify cross-communication opportunities.

Approach:

  1. Collected Slack and GitHub interaction logs.

  2. Created a weighted directed graph:

    • Nodes: engineers, designers, product managers

    • Edges: collaboration frequency

  3. Applied centrality and clustering analysis.

Key Findings:

  • High internal collaboration within engineering, low interaction with design.

  • Several engineers acted as bridges between departments.

  • Product managers were peripheral with limited technical communication.

Actions:

  • Formed cross-functional task forces.

  • Introduced weekly cross-department syncs.

  • Recognized bridge employees as “collaboration champions.”

Outcome:
Within six months:

  • Project delivery time decreased by 18%

  • Employee engagement scores improved

  • Company saw a 22% increase in cross-team innovations


Tips for Engineers

  1. Start Simple: Begin with basic metrics before diving into advanced analysis.

  2. Use Visual Tools: Network diagrams often reveal patterns that numbers miss.

  3. Define Clear Objectives: Ask what questions you want to answer with your graph.

  4. Benchmark Metrics: Compare against historical or industry norms.

  5. Collaborate with Domain Experts: HR and organizational behavior specialists add valuable context.

  6. Respect Privacy: Always adhere to ethical data practices.


FAQs

1. What is the difference between a node and an edge?

Answer: A node represents an entity (e.g., employee) while an edge represents a relationship (e.g., communication) between two nodes.


2. Why use network analysis in people analytics?

Answer: It helps uncover hidden patterns, influence, and connection dynamics that traditional data analysis cannot reveal.


3. Can graphs model time-based changes in relationships?

Answer: Yes, temporal graphs capture how edge properties change over time, enabling trend analysis.


4. Which tools are best for graph visualization?

Answer: Gephi, Cytoscape, D3.js, and Python libraries like NetworkX with Matplotlib or Plotly are popular choices.


5. How do you interpret a node with high centrality?

Answer: High centrality indicates influence or connectivity, but the specific meaning depends on the metric (degree, closeness, betweenness).


6. Are network graphs only for social data?

Answer: No. Networks can model any system of entities and relationships — organizational, technical, or biological.


7. What is a weighted graph?

Answer: A graph where edges carry a numerical value representing strength, frequency, or significance of the relationship.


8. Do network analyses require coding skills?

Answer: Basic analyses can be done with tools requiring minimal coding, but more advanced work benefits from programming knowledge.


Conclusion

Graphs and networks are transformative tools in people analytics — offering a lens to view relationships, influence, and structure within complex human systems. Whether you are a student learning the foundations of data modeling, or a professional applying insights to organizational challenges, mastering graph analysis unlocks a world of powerful analytical possibilities.

This handbook has guided you from basic definitions to real-world applications, practical challenges, and case studies. As you begin to explore your own datasets, remember that every network tells a story — and with the right tools, you can reveal it.
