# How to Calculate Summary Statistics in an R DataFrame

The summarize() function from the dplyr package can be used to calculate summary statistics for the columns of an R DataFrame.

Here are the steps to derive the summary statistics for a given DataFrame.

## Steps to calculate summary statistics in an R DataFrame

### Step 1: Install the dplyr package

To start, install the dplyr package if you haven’t already done so:

`install.packages("dplyr")`
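
If the script is run more than once, reinstalling the package each time is unnecessary. Here is a minimal sketch, assuming a CRAN mirror is already configured, that installs dplyr only when it is missing (the variable name `dplyr_available` is just illustrative):

```r
# Check whether dplyr can be loaded, without attaching it
dplyr_available <- requireNamespace("dplyr", quietly = TRUE)

# Install only when the package is missing
if (!dplyr_available) {
  install.packages("dplyr")
}
```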

### Step 2: Create a DataFrame

Next, create a DataFrame in R as follows:

```
# Create a DataFrame
df <- data.frame(
  StudentID = 1:5,
  Name = c("Alice", "Bob", "Charlie", "David", "Eve"),
  Score = c(85, 92, 78, 89, 95)
)

# Print the DataFrame
print(df)
```

The result:

```
  StudentID    Name Score
1         1   Alice    85
2         2     Bob    92
3         3 Charlie    78
4         4   David    89
5         5     Eve    95
```
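
Before summarizing, it can help to confirm that the column of interest is numeric, since functions such as mean() and sd() expect numeric input. The following sketch rebuilds the same DataFrame and checks the column classes (the name `column_classes` is an illustrative choice; the class of the Name column may differ across R versions):

```r
# Recreate the DataFrame from Step 2
df <- data.frame(
  StudentID = 1:5,
  Name = c("Alice", "Bob", "Charlie", "David", "Eve"),
  Score = c(85, 92, 78, 89, 95)
)

# Show the class of each column
column_classes <- sapply(df, class)
print(column_classes)

# Stop early if Score is not numeric
stopifnot(is.numeric(df$Score))
```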

### Step 3: Calculate summary statistics

Finally, load dplyr and calculate the summary statistics for the Score column using the summarize() function:

```
# Load the dplyr package
library(dplyr)

# Create a DataFrame
df <- data.frame(
  StudentID = 1:5,
  Name = c("Alice", "Bob", "Charlie", "David", "Eve"),
  Score = c(85, 92, 78, 89, 95)
)

# Calculate summary statistics for the student Score
summary_stats <- df %>%
  summarize(
    Mean_Score = mean(Score),
    Median_Score = median(Score),
    Min_Score = min(Score),
    Max_Score = max(Score),
    StdDev_Score = sd(Score),
    Variance_Score = var(Score)
  )

# View the summary statistics
print(summary_stats)
```

The result:

```
  Mean_Score Median_Score Min_Score Max_Score StdDev_Score Variance_Score
1       87.8           89        78        95     6.610598           43.7
```
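
As a cross-check, the same figures can be reproduced with base R alone, without dplyr. This is a sketch rather than part of the steps above; the variable names `scores`, `score_sd`, and `score_var` are illustrative. Note that sd() and var() compute the sample statistics, dividing by n - 1 rather than n.

```r
# Scores from the DataFrame in Step 2
scores <- c(85, 92, 78, 89, 95)

# Minimum, quartiles, median, mean, and maximum in one call
print(summary(scores))

# Sample standard deviation and variance (denominator n - 1)
score_sd <- sd(scores)
score_var <- var(scores)
print(score_sd)
print(score_var)

# The variance equals the square of the standard deviation
print(all.equal(score_sd^2, score_var))
```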
