Here are two ways to count duplicates in a Python list:
1. Using the “count” method:
my_list = ["a", "a", "z", "z", "s", "a", "a", "b", "b", "k"]
my_dict = {i: my_list.count(i) for i in my_list}
print(my_dict)
The result:
{'a': 4, 'z': 2, 's': 1, 'b': 2, 'k': 1}
2. Using collections.Counter:
import collections
my_list = ["a", "a", "z", "z", "s", "a", "a", "b", "b", "k"]
my_dict = dict(collections.Counter(my_list))
print(my_dict)
The result:
{'a': 4, 'z': 2, 's': 1, 'b': 2, 'k': 1}
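Both approaches return the same counts, but they scale differently. list.count scans the whole list on every call, so the dictionary comprehension in the first approach does work roughly proportional to the square of the list length, while Counter tallies all values in a single pass. The following is a minimal sketch comparing the two. The names by_count and by_counter are illustrative, and iterating over dict.fromkeys(my_list) is an assumption of this sketch (not part of the snippets above) that avoids recounting repeated values while keeping first-seen order.

```python
import collections

my_list = ["a", "a", "z", "z", "s", "a", "a", "b", "b", "k"]

# dict.fromkeys keeps one entry per distinct value, in first-seen order,
# so list.count runs once per distinct value instead of once per element.
by_count = {i: my_list.count(i) for i in dict.fromkeys(my_list)}

# Counter is a dict subclass; dict() is only needed for a plain dict.
counter = collections.Counter(my_list)
by_counter = dict(counter)

print(by_count == by_counter)  # True
print(counter.most_common(1))  # [('a', 4)]
```

For short lists the difference is negligible; for long lists with many distinct values, Counter is usually the safer default.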
Keeping only the duplicates
To keep only the duplicates (while filtering out the non-duplicate values):
import collections
my_list = ["a", "a", "z", "z", "s", "a", "a", "b", "b", "k"]
counter = collections.Counter(my_list)
my_dict = {k: v for k, v in counter.items() if v > 1}
print(my_dict)
The result:
{'a': 4, 'z': 2, 'b': 2}
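If only the repeated values themselves are needed, or the number of surplus copies, both can be derived from the same Counter. A small sketch, where duplicate_values and extra_copies are hypothetical names chosen for illustration:

```python
import collections

my_list = ["a", "a", "z", "z", "s", "a", "a", "b", "b", "k"]
counter = collections.Counter(my_list)

# Values that occur more than once, in first-seen order.
duplicate_values = [k for k, v in counter.items() if v > 1]

# Occurrences beyond the first of each value;
# this equals len(my_list) - len(counter).
extra_copies = sum(v - 1 for v in counter.values())

print(duplicate_values)  # ['a', 'z', 'b']
print(extra_copies)  # 5
```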
Create a function
Here is a function to count duplicates in a list using collections:
import collections


def count_duplicates(my_list: list) -> dict:
    """
    Count the duplicate values in a list.

    :param my_list: The input list in which to count duplicates
    :return: A dictionary mapping each duplicated value to its count
    """
    counter = collections.Counter(my_list)
    return {k: v for k, v in counter.items() if v > 1}


example_list = ["a", "a", "z", "z", "s", "a", "a", "b", "b", "k"]
print(count_duplicates(example_list))
The result:
{'a': 4, 'z': 2, 'b': 2}
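A few behaviours of this function follow from how Counter works and may be worth knowing before relying on it. The sketch below repeats the function so that it runs on its own; the inputs are made-up examples. It assumes the caller is happy to convert unhashable items (such as nested lists) to tuples first, which is one possible workaround rather than the only one.

```python
import collections


def count_duplicates(my_list):
    counter = collections.Counter(my_list)
    return {k: v for k, v in counter.items() if v > 1}


# Any iterable of hashable items works, not only lists.
print(count_duplicates("banana"))  # {'a': 3, 'n': 2}

# Values that compare equal and hash equal share one key (the first seen).
print(count_duplicates([1, True, 1.0]))  # {1: 3}

# Unhashable items such as lists raise TypeError inside Counter.
rows = [[1, 2], [1, 2], [3]]
try:
    count_duplicates(rows)
except TypeError:
    # Workaround used in this sketch: convert each row to a tuple.
    print(count_duplicates([tuple(row) for row in rows]))  # {(1, 2): 2}
```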
Test the function
To test the “count_duplicates” function using Pytest:
import pytest
from app.list_operations import count_duplicates


@pytest.mark.parametrize(
    "input_list, expected_output",
    [
        (["a", "a", "z", "z", "s", "a", "a", "b", "b", "k"], {"a": 4, "z": 2, "b": 2}),
        (["a", "a"], {"a": 2}),
        (["a"], {}),
        (["aa", "aa", "bbb", "bb", "cc", "cc"], {"aa": 2, "cc": 2}),
        ([1, 1, 3, 1, 5, 5, 9], {1: 3, 5: 2}),
    ],
)
def test_count_duplicates(input_list, expected_output):
    assert count_duplicates(input_list) == expected_output
The result:
============================= test session starts =============================
collecting ... collected 5 items
test_list_operations.py::test_count_duplicates[input_list0-expected_output0] PASSED [ 20%]
test_list_operations.py::test_count_duplicates[input_list1-expected_output1] PASSED [ 40%]
test_list_operations.py::test_count_duplicates[input_list2-expected_output2] PASSED [ 60%]
test_list_operations.py::test_count_duplicates[input_list3-expected_output3] PASSED [ 80%]
test_list_operations.py::test_count_duplicates[input_list4-expected_output4] PASSED [100%]
============================== 5 passed in 0.01s ==============================