Overview

Dataset statistics

Number of variables: 22
Number of observations: 200192
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 33.6 MiB
Average record size in memory: 176.0 B
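
As a quick sanity check, the average record size can be recomputed from the total size and the number of observations. The sketch below uses only the figures reported above. The 8-bytes-per-value reading is an assumption (it happens to match 22 variables x 8 B = 176 B), not something the report states.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 22
n_observations = 200192
total_size_mib = 33.6

BYTES_PER_MIB = 1024 ** 2

# Average record size implied by the (rounded) total size in memory.
avg_record_bytes = total_size_mib * BYTES_PER_MIB / n_observations

# Assumption: each variable occupies one 8-byte slot per row.
assumed_bytes_per_value = 8
expected_record_bytes = n_variables * assumed_bytes_per_value

print(round(avg_record_bytes, 1))
print(expected_record_bytes)
```

Both numbers agree with the reported 176.0 B once the total size is rounded to one decimal place.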

Variable types

Categorical: 16
Numeric: 6

Alerts

BMI is highly correlated with PhysHlth and 1 other field [High correlation]
PhysHlth is highly correlated with BMI and 1 other field [High correlation]
Diabetes_binary is highly correlated with BMI and 1 other field [High correlation]
BMI is highly correlated with PhysHlth and 1 other field [High correlation]
PhysHlth is highly correlated with BMI and 1 other field [High correlation]
Diabetes_binary is highly correlated with BMI and 1 other field [High correlation]
BMI is highly correlated with Diabetes_binary [High correlation]
PhysHlth is highly correlated with Diabetes_binary [High correlation]
Diabetes_binary is highly correlated with BMI and 1 other field [High correlation]
BMI is highly correlated with PhysHlth and 1 other field [High correlation]
MentHlth is highly correlated with PhysHlth and 1 other field [High correlation]
PhysHlth is highly correlated with BMI and 2 other fields [High correlation]
Diabetes_binary is highly correlated with BMI and 2 other fields [High correlation]
Diabetes_binary is uniformly distributed [Uniform]

Reproduction

Analysis started: 2022-07-28 05:36:30.623486
Analysis finished: 2022-07-28 05:37:00.887424
Duration: 30.26 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
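
The reported duration is just the difference between the two timestamps. A minimal sketch with the standard library, using the timestamps exactly as they appear in this section:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" section.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2022-07-28 05:36:30.623486", TIMESTAMP_FORMAT)
finished = datetime.strptime("2022-07-28 05:37:00.887424", TIMESTAMP_FORMAT)

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()

print(round(duration_seconds, 2))
```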

Variables

HighBP
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size11.5 MiB
0.0
108430 
1.0
91762 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters600576
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row0.0
3rd row1.0
4th row1.0
5th row1.0

Common Values

Value  Count   Frequency (%)
0.0    108430  54.2%
1.0    91762   45.8%
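
The frequency column is each count divided by the total number of observations. The sketch below recomputes it for HighBP from the two counts listed in this table; the dictionary layout is only an illustration, not how the report stores its data.

```python
# Counts copied from the HighBP "Common Values" table.
counts = {"0.0": 108430, "1.0": 91762}

# The counts should add up to the number of observations (200192).
total = sum(counts.values())

# Share of each value, in percent, rounded to one decimal like the report.
percentages = {
    value: round(100 * count / total, 1) for value, count in counts.items()
}

print(total)
print(percentages)
```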

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.0108430
54.2%
1.091762
45.8%

Most occurring characters

Value  Count   Frequency (%)
0      308622  51.4%
.      200192  33.3%
1      91762   15.3%
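
Because HighBP is stored as the strings "0.0" and "1.0", the character statistics follow directly from the value counts: every row contributes three characters. The sketch below rebuilds the character table under that reading, using only the counts reported for this variable.

```python
from collections import Counter

# Value counts copied from the HighBP "Common Values" table.
value_counts = {"0.0": 108430, "1.0": 91762}

# Each occurrence of a value contributes all of its characters.
char_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count

total_chars = sum(char_counts.values())
char_percentages = {
    ch: round(100 * n / total_chars, 1) for ch, n in char_counts.items()
}

print(total_chars)
print(dict(char_counts))
print(char_percentages)
```

The totals match the report: 600576 characters overall, with "0" appearing twice in every "0.0" and once in every "1.0".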

Most occurring categories

ValueCountFrequency (%)
Decimal Number400384
66.7%
Other Punctuation200192
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0308622
77.1%
191762
 
22.9%
Other Punctuation
ValueCountFrequency (%)
.200192
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common600576
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0308622
51.4%
.200192
33.3%
191762
 
15.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII600576
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0308622
51.4%
.200192
33.3%
191762
 
15.3%

HighChol
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size11.5 MiB
0.0
111317 
1.0
88875 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters600576
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0111317
55.6%
1.088875
44.4%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.0111317
55.6%
1.088875
44.4%

Most occurring characters

ValueCountFrequency (%)
0311509
51.9%
.200192
33.3%
188875
 
14.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number400384
66.7%
Other Punctuation200192
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0311509
77.8%
188875
 
22.2%
Other Punctuation
ValueCountFrequency (%)
.200192
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common600576
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0311509
51.9%
.200192
33.3%
188875
 
14.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII600576
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0311509
51.9%
.200192
33.3%
188875
 
14.8%

CholCheck
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size11.5 MiB
0.0
110920 
1.0
89272 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters600576
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row1.0
3rd row1.0
4th row0.0
5th row1.0

Common Values

ValueCountFrequency (%)
0.0110920
55.4%
1.089272
44.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.0110920
55.4%
1.089272
44.6%

Most occurring characters

ValueCountFrequency (%)
0311112
51.8%
.200192
33.3%
189272
 
14.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number400384
66.7%
Other Punctuation200192
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0311112
77.7%
189272
 
22.3%
Other Punctuation
ValueCountFrequency (%)
.200192
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common600576
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0311112
51.8%
.200192
33.3%
189272
 
14.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII600576
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0311112
51.8%
.200192
33.3%
189272
 
14.9%

BMI
Real number (ℝ≥0)

HIGH CORRELATION (4 alerts)

Distinct: 197844
Distinct (%): 98.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.94536217
Minimum: 12.07818127
Maximum: 73.7012558
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 12.07818127
5th percentile: 23.08314381
Q1: 27.07612228
Median: 32.5464077
Q3: 41.96646118
95th percentile: 52.87893448
Maximum: 73.7012558
Range: 61.62307453
Interquartile range (IQR): 14.8903389
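
The last two rows are derived from the others: the range is maximum minus minimum, and the IQR is Q3 minus Q1. A short check using the BMI figures from this table:

```python
# Quantile statistics copied from the BMI section.
minimum = 12.07818127
q1 = 27.07612228
q3 = 41.96646118
maximum = 73.7012558

# Range spans the full data; IQR spans the middle 50%.
value_range = maximum - minimum
iqr = q3 - q1

print(round(value_range, 8))
print(round(iqr, 7))
```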

Descriptive statistics

Standard deviation: 9.606681251
Coefficient of variation (CV): 0.2749057573
Kurtosis: -0.4092645864
Mean: 34.94536217
Median Absolute Deviation (MAD): 6.634771347
Skewness: 0.6640821088
Sum: 6995781.944
Variance: 92.28832467
Monotonicity: Not monotonic
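
Several of these statistics are tied together by simple identities: CV is the standard deviation divided by the mean, variance is the standard deviation squared, and the sum is the mean times the number of observations. The sketch below checks the BMI figures against those identities; small differences are expected because the report rounds each value.

```python
# Descriptive statistics copied from the BMI section.
std_dev = 9.606681251
mean = 34.94536217
n_observations = 200192

# Identities linking the reported statistics.
coefficient_of_variation = std_dev / mean
variance = std_dev ** 2
total = mean * n_observations

print(round(coefficient_of_variation, 6))
print(round(variance, 4))
print(round(total, 2))
```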
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
27.499851233
 
< 0.1%
28.716093063
 
< 0.1%
28.758600233
 
< 0.1%
53.492092133
 
< 0.1%
27.375801093
 
< 0.1%
34.544780733
 
< 0.1%
37.251758583
 
< 0.1%
53.239849093
 
< 0.1%
32.249317173
 
< 0.1%
28.670061113
 
< 0.1%
Other values (197834)200162
> 99.9%
ValueCountFrequency (%)
12.078181271
< 0.1%
12.512852671
< 0.1%
12.79787351
< 0.1%
13.030323031
< 0.1%
13.054581641
< 0.1%
13.240945821
< 0.1%
13.497028351
< 0.1%
13.556580541
< 0.1%
13.730919841
< 0.1%
13.741152761
< 0.1%
ValueCountFrequency (%)
73.70125581
< 0.1%
72.449028021
< 0.1%
70.999160771
< 0.1%
70.824111941
< 0.1%
70.806365971
< 0.1%
70.473617551
< 0.1%
70.25672151
< 0.1%
69.976684571
< 0.1%
69.974121091
< 0.1%
69.87512971
< 0.1%

Smoker
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size11.5 MiB
0.0
111795 
1.0
88397 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters600576
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0111795
55.8%
1.088397
44.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.0111795
55.8%
1.088397
44.2%

Most occurring characters

ValueCountFrequency (%)
0311987
51.9%
.200192
33.3%
188397
 
14.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number400384
66.7%
Other Punctuation200192
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0311987
77.9%
188397
 
22.1%
Other Punctuation
ValueCountFrequency (%)
.200192
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common600576
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0311987
51.9%
.200192
33.3%
188397
 
14.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII600576
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0311987
51.9%
.200192
33.3%
188397
 
14.7%

Stroke
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size11.5 MiB
1.0
104426 
0.0
95766 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters600576
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row1.0
3rd row1.0
4th row1.0
5th row1.0

Common Values

ValueCountFrequency (%)
1.0104426
52.2%
0.095766
47.8%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
1.0104426
52.2%
0.095766
47.8%

Most occurring characters

ValueCountFrequency (%)
0295958
49.3%
.200192
33.3%
1104426
 
17.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number400384
66.7%
Other Punctuation200192
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0295958
73.9%
1104426
 
26.1%
Other Punctuation
ValueCountFrequency (%)
.200192
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common600576
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0295958
49.3%
.200192
33.3%
1104426
 
17.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII600576
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0295958
49.3%
.200192
33.3%
1104426
 
17.4%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size11.5 MiB
0.0
107304 
1.0
92888 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters600576
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row1.0
3rd row0.0
4th row1.0
5th row1.0

Common Values

ValueCountFrequency (%)
0.0107304
53.6%
1.092888
46.4%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.0107304
53.6%
1.092888
46.4%

Most occurring characters

ValueCountFrequency (%)
0307496
51.2%
.200192
33.3%
192888
 
15.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number400384
66.7%
Other Punctuation200192
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0307496
76.8%
192888
 
23.2%
Other Punctuation
ValueCountFrequency (%)
.200192
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common600576
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0307496
51.2%
.200192
33.3%
192888
 
15.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII600576
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0307496
51.2%
.200192
33.3%
192888
 
15.5%

PhysActivity
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size11.5 MiB
1.0
101184 
0.0
99008 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters600576
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row1.0
3rd row1.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
1.0101184
50.5%
0.099008
49.5%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
1.0101184
50.5%
0.099008
49.5%

Most occurring characters

ValueCountFrequency (%)
0299200
49.8%
.200192
33.3%
1101184
 
16.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number400384
66.7%
Other Punctuation200192
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0299200
74.7%
1101184
 
25.3%
Other Punctuation
ValueCountFrequency (%)
.200192
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common600576
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0299200
49.8%
.200192
33.3%
1101184
 
16.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII600576
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0299200
49.8%
.200192
33.3%
1101184
 
16.8%

Fruits
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size11.5 MiB
0.0
108051 
1.0
92141 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters600576
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row0.0
3rd row0.0
4th row1.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0108051
54.0%
1.092141
46.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.0108051
54.0%
1.092141
46.0%

Most occurring characters

ValueCountFrequency (%)
0308243
51.3%
.200192
33.3%
192141
 
15.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number400384
66.7%
Other Punctuation200192
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0308243
77.0%
192141
 
23.0%
Other Punctuation
ValueCountFrequency (%)
.200192
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common600576
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0308243
51.3%
.200192
33.3%
192141
 
15.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII600576
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0308243
51.3%
.200192
33.3%
192141
 
15.3%

Veggies
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size11.5 MiB
0.0
107286 
1.0
92906 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters600576
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row1.0

Common Values

ValueCountFrequency (%)
0.0107286
53.6%
1.092906
46.4%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.0107286
53.6%
1.092906
46.4%

Most occurring characters

ValueCountFrequency (%)
0307478
51.2%
.200192
33.3%
192906
 
15.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number400384
66.7%
Other Punctuation200192
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0307478
76.8%
192906
 
23.2%
Other Punctuation
ValueCountFrequency (%)
.200192
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common600576
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0307478
51.2%
.200192
33.3%
192906
 
15.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII600576
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0307478
51.2%
.200192
33.3%
192906
 
15.5%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size11.5 MiB
0.0
102411 
1.0
97781 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters600576
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row1.0
3rd row1.0
4th row0.0
5th row1.0

Common Values

ValueCountFrequency (%)
0.0102411
51.2%
1.097781
48.8%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.0102411
51.2%
1.097781
48.8%

Most occurring characters

ValueCountFrequency (%)
0302603
50.4%
.200192
33.3%
197781
 
16.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number400384
66.7%
Other Punctuation200192
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0302603
75.6%
197781
 
24.4%
Other Punctuation
ValueCountFrequency (%)
.200192
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common600576
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0302603
50.4%
.200192
33.3%
197781
 
16.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII600576
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0302603
50.4%
.200192
33.3%
197781
 
16.3%

AnyHealthcare
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size11.5 MiB
1.0
117896 
0.0
82296 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters600576
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row1.0
3rd row1.0
4th row1.0
5th row0.0

Common Values

ValueCountFrequency (%)
1.0117896
58.9%
0.082296
41.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
1.0117896
58.9%
0.082296
41.1%

Most occurring characters

ValueCountFrequency (%)
0282488
47.0%
.200192
33.3%
1117896
19.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number400384
66.7%
Other Punctuation200192
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0282488
70.6%
1117896
29.4%
Other Punctuation
ValueCountFrequency (%)
.200192
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common600576
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0282488
47.0%
.200192
33.3%
1117896
19.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII600576
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0282488
47.0%
.200192
33.3%
1117896
19.6%

NoDocbcCost
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size11.5 MiB
0.0
102772 
1.0
97420 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters600576
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row1.0
3rd row0.0
4th row1.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0102772
51.3%
1.097420
48.7%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.0102772
51.3%
1.097420
48.7%

Most occurring characters

ValueCountFrequency (%)
0302964
50.4%
.200192
33.3%
197420
 
16.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number400384
66.7%
Other Punctuation200192
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0302964
75.7%
197420
 
24.3%
Other Punctuation
ValueCountFrequency (%)
.200192
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common600576
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0302964
50.4%
.200192
33.3%
197420
 
16.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII600576
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0302964
50.4%
.200192
33.3%
197420
 
16.2%

GenHlth
Categorical

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size11.5 MiB
3.0
43571 
2.0
42955 
4.0
42383 
1.0
40477 
5.0
30806 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters600576
Distinct characters7
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2.0
2nd row3.0
3rd row1.0
4th row1.0
5th row1.0

Common Values

ValueCountFrequency (%)
3.043571
21.8%
2.042955
21.5%
4.042383
21.2%
1.040477
20.2%
5.030806
15.4%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
3.043571
21.8%
2.042955
21.5%
4.042383
21.2%
1.040477
20.2%
5.030806
15.4%

Most occurring characters

ValueCountFrequency (%)
.200192
33.3%
0200192
33.3%
343571
 
7.3%
242955
 
7.2%
442383
 
7.1%
140477
 
6.7%
530806
 
5.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number400384
66.7%
Other Punctuation200192
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0200192
50.0%
343571
 
10.9%
242955
 
10.7%
442383
 
10.6%
140477
 
10.1%
530806
 
7.7%
Other Punctuation
ValueCountFrequency (%)
.200192
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common600576
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
.200192
33.3%
0200192
33.3%
343571
 
7.3%
242955
 
7.2%
442383
 
7.1%
140477
 
6.7%
530806
 
5.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII600576
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.200192
33.3%
0200192
33.3%
343571
 
7.3%
242955
 
7.2%
442383
 
7.1%
140477
 
6.7%
530806
 
5.1%

MentHlth
Real number (ℝ)

HIGH CORRELATION

Distinct198758
Distinct (%)99.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean-0.2835778274
Minimum-13.49863529
Maximum29.95352745
Zeros0
Zeros (%)0.0%
Negative138377
Negative (%)69.1%
Memory size1.5 MiB

Quantile statistics

Minimum-13.49863529
5-th percentile-6.880828428
Q1-3.188583553
median-1.966302216
Q31.407046437
95-th percentile10.64880967
Maximum29.95352745
Range43.45216274
Interquartile range (IQR)4.59562999

Descriptive statistics

Standard deviation5.43507722
Coefficient of variation (CV)-19.16608668
Kurtosis4.19952029
Mean-0.2835778274
Median Absolute Deviation (MAD)1.700447321
Skewness1.692974324
Sum-56770.01242
Variance29.54006439
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
-3.5333769323
 
< 0.1%
-2.8932571413
 
< 0.1%
-2.3439888953
 
< 0.1%
-2.3781447413
 
< 0.1%
-3.2990419863
 
< 0.1%
-2.2560293673
 
< 0.1%
-2.964768413
 
< 0.1%
-2.1323971753
 
< 0.1%
-2.8491425513
 
< 0.1%
-1.4490274193
 
< 0.1%
Other values (198748)200162
> 99.9%
ValueCountFrequency (%)
-13.498635291
< 0.1%
-13.445617681
< 0.1%
-13.397555351
< 0.1%
-13.290341381
< 0.1%
-13.120910641
< 0.1%
-13.110137941
< 0.1%
-12.875812531
< 0.1%
-12.874480251
< 0.1%
-12.70591451
< 0.1%
-12.683765411
< 0.1%
ValueCountFrequency (%)
29.953527451
< 0.1%
29.948450091
< 0.1%
29.94201661
< 0.1%
29.934295651
< 0.1%
29.926513671
< 0.1%
29.924571991
< 0.1%
29.923616411
< 0.1%
29.922515871
< 0.1%
29.918207171
< 0.1%
29.915565491
< 0.1%

PhysHlth
Real number (ℝ)

HIGH CORRELATION (4 alerts)

Distinct194381
Distinct (%)97.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean14.95798922
Minimum-13.73297882
Maximum29.14030457
Zeros0
Zeros (%)0.0%
Negative7984
Negative (%)4.0%
Memory size1.5 MiB

Quantile statistics

Minimum-13.73297882
5-th percentile0.2596646398
Q12.229513586
median19.80461884
Q327.96361399
95-th percentile28.52459154
Maximum29.14030457
Range42.87328339
Interquartile range (IQR)25.7341004

Descriptive statistics

Standard deviation12.84185559
Coefficient of variation (CV)0.8585282022
Kurtosis-1.937484814
Mean14.95798922
Median Absolute Deviation (MAD)9.069877625
Skewness-0.029866479
Sum2994469.778
Variance164.9132551
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
27.927387245
 
< 0.1%
28.145294194
 
< 0.1%
28.097377784
 
< 0.1%
28.074415214
 
< 0.1%
28.610351564
 
< 0.1%
28.515153884
 
< 0.1%
27.879323964
 
< 0.1%
28.10641674
 
< 0.1%
28.03882794
 
< 0.1%
27.933982854
 
< 0.1%
Other values (194371)200151
> 99.9%
ValueCountFrequency (%)
-13.732978821
< 0.1%
-13.38292981
< 0.1%
-13.082541471
< 0.1%
-11.807616231
< 0.1%
-11.660846711
< 0.1%
-11.365406041
< 0.1%
-11.274729731
< 0.1%
-10.79300691
< 0.1%
-10.744480131
< 0.1%
-10.624603271
< 0.1%
ValueCountFrequency (%)
29.140304571
< 0.1%
29.133586881
< 0.1%
29.115106581
< 0.1%
29.107652661
< 0.1%
29.096935271
< 0.1%
29.096443181
< 0.1%
29.093599321
< 0.1%
29.085485461
< 0.1%
29.085231781
< 0.1%
29.085048681
< 0.1%

DiffWalk
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size11.5 MiB
1.0
106670 
0.0
93522 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters600576
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row1.0
5th row1.0

Common Values

ValueCountFrequency (%)
1.0106670
53.3%
0.093522
46.7%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
1.0106670
53.3%
0.093522
46.7%

Most occurring characters

ValueCountFrequency (%)
0293714
48.9%
.200192
33.3%
1106670
 
17.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number400384
66.7%
Other Punctuation200192
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0293714
73.4%
1106670
 
26.6%
Other Punctuation
ValueCountFrequency (%)
.200192
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common600576
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0293714
48.9%
.200192
33.3%
1106670
 
17.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII600576
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0293714
48.9%
.200192
33.3%
1106670
 
17.8%

Sex
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size11.5 MiB
0.0
104165 
1.0
96027 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters600576
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row0.0
3rd row1.0
4th row1.0
5th row1.0

Common Values

ValueCountFrequency (%)
0.0104165
52.0%
1.096027
48.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.0104165
52.0%
1.096027
48.0%

Most occurring characters

ValueCountFrequency (%)
0304357
50.7%
.200192
33.3%
196027
 
16.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number400384
66.7%
Other Punctuation200192
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0304357
76.0%
196027
 
24.0%
Other Punctuation
ValueCountFrequency (%)
.200192
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common600576
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0304357
50.7%
.200192
33.3%
196027
 
16.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII600576
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0304357
50.7%
.200192
33.3%
196027
 
16.0%

Age
Real number (ℝ≥0)

Distinct13
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean6.983016304
Minimum1
Maximum13
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.5 MiB

Quantile statistics

Minimum1
5-th percentile1
Q14
median7
Q310
95-th percentile13
Maximum13
Range12
Interquartile range (IQR)6

Descriptive statistics

Standard deviation3.748508006
Coefficient of variation (CV)0.5368035592
Kurtosis-1.226330109
Mean6.983016304
Median Absolute Deviation (MAD)3
Skewness0.01021103337
Sum1397944
Variance14.05131227
MonotonicityNot monotonic
Histogram with fixed size bins (bins=13)
Value             Count  Frequency (%)
2                 17739  8.9%
10                17122  8.6%
7                 16686  8.3%
13                15794  7.9%
3                 15753  7.9%
11                15731  7.9%
8                 15398  7.7%
5                 15363  7.7%
6                 14987  7.5%
12                14350  7.2%
Other values (3)  41269  20.6%
ValueCountFrequency (%)
114280
7.1%
217739
8.9%
315753
7.9%
414096
7.0%
515363
7.7%
614987
7.5%
716686
8.3%
815398
7.7%
912893
6.4%
1017122
8.6%
ValueCountFrequency (%)
1315794
7.9%
1214350
7.2%
1115731
7.9%
1017122
8.6%
912893
6.4%
815398
7.7%
716686
8.3%
614987
7.5%
515363
7.7%
414096
7.0%

Education
Real number (ℝ≥0)

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.402548553
Minimum: 1
Maximum: 6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 3
Q3: 5
95-th percentile: 6
Maximum: 6
Range: 5
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.747906292
Coefficient of variation (CV): 0.5137050256
Kurtosis: -1.319063595
Mean: 3.402548553
Median Absolute Deviation (MAD): 2
Skewness: 0.02161135433
Sum: 681163
Variance: 3.055176405
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)

Common values (by frequency)
Value  Count  Frequency (%)
1      41698  20.8%
4      34662  17.3%
5      33245  16.6%
6      31038  15.5%
2      30283  15.1%
3      29266  14.6%

Lowest values (ascending)
Value  Count  Frequency (%)
1      41698  20.8%
2      30283  15.1%
3      29266  14.6%
4      34662  17.3%
5      33245  16.6%
6      31038  15.5%

Highest values (descending)
Value  Count  Frequency (%)
6      31038  15.5%
5      33245  16.6%
4      34662  17.3%
3      29266  14.6%
2      30283  15.1%
1      41698  20.8%

Income
Real number (ℝ≥0)

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.356038203
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 4
Q3: 6
95-th percentile: 8
Maximum: 8
Range: 7
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.226225965
Coefficient of variation (CV): 0.511066676
Kurtosis: -1.179795893
Mean: 4.356038203
Median Absolute Deviation (MAD): 2
Skewness: 0.1265259176
Sum: 872044
Variance: 4.956082047
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=8)

Common values (by frequency)
Value  Count  Frequency (%)
2      31527  15.7%
5      30528  15.2%
3      29104  14.5%
8      22324  11.2%
6      22219  11.1%
4      21630  10.8%
1      21568  10.8%
7      21292  10.6%

Lowest values (ascending)
Value  Count  Frequency (%)
1      21568  10.8%
2      31527  15.7%
3      29104  14.5%
4      21630  10.8%
5      30528  15.2%
6      22219  11.1%
7      21292  10.6%
8      22324  11.2%

Highest values (descending)
Value  Count  Frequency (%)
8      22324  11.2%
7      21292  10.6%
6      22219  11.1%
5      30528  15.2%
4      21630  10.8%
3      29104  14.5%
2      31527  15.7%
1      21568  10.8%

Diabetes_binary
Categorical

HIGH CORRELATION
UNIFORM

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.5 MiB

Value  Count
1.0    100096
0.0    100096

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 600576
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value  Count   Frequency (%)
1.0    100096  50.0%
0.0    100096  50.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count   Frequency (%)
1.0    100096  50.0%
0.0    100096  50.0%

Most occurring characters

Value  Count   Frequency (%)
0      300288  50.0%
.      200192  33.3%
1      100096  16.7%

Most occurring categories

Value              Count   Frequency (%)
Decimal Number     400384  66.7%
Other Punctuation  200192  33.3%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      300288  75.0%
1      100096  25.0%

Other Punctuation
Value  Count   Frequency (%)
.      200192  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  600576  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      300288  50.0%
.      200192  33.3%
1      100096  16.7%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  600576  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      300288  50.0%
.      200192  33.3%
1      100096  16.7%

Interactions

[Figure: 36 pairwise interaction plots]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic relationships than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
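
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the report's own implementation: the function names `average_ranks` and `spearman_rho` are assumptions made for the example, and tied values are assumed to share the average of their rank positions.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables over the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    # The 1/n factors of the covariance and the variances cancel, so they are omitted.
    return cov / math.sqrt(var_x * var_y)


# A nonlinear but strictly increasing relationship is still perfectly monotonic.
print(spearman_rho([1, 2, 3, 4], [1, 8, 27, 64]))
```

Because only the ranks enter the formula, any strictly increasing transformation of either variable leaves the result unchanged.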

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
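
As a minimal sketch of this formula (the function name `pearson_r` and the sample data are assumptions made for illustration, not values from the dataset), the computation and the invariance under shifting and positive rescaling can be checked in plain Python:

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Hypothetical, nearly linear sample data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.0, 9.8]

r = pearson_r(x, y)
# Shifting and positively rescaling either variable should leave r unchanged.
r_transformed = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
print(round(r, 6), round(r_transformed, 6))
```

The population (1/n) form is used for both the covariance and the standard deviations; using the sample (1/(n-1)) form throughout gives the same r, because the factors cancel.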

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
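
The pair counting can be sketched directly in plain Python. This is an illustrative implementation of the formula exactly as stated (often called tau-a); the function name `kendall_tau_a` is an assumption, and the report's own values may come from a tie-adjusted variant such as tau-b, which is the default in SciPy.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie adjustment."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 means a tie in x or y; it counts toward neither total.
    return (concordant - discordant) / len(pairs)


# One swapped pair out of six possible pairs.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```

The loop is quadratic in the number of observations, so for a dataset of this size a library routine would be the practical choice. For columns with many ties, such as binary indicators, tau-a and tau-b can differ noticeably.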

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
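
A minimal sketch of the bias-corrected estimator, assuming the input is a contingency table of observed counts given as a list of rows. The function name `cramers_v_corrected` and the example tables are hypothetical, and this is not necessarily how the profiling software computes the value.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V (Bergsma, 2013) for a table of observed counts."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected bias of phi^2 and shrink the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denom)


# Hypothetical 2x2 tables: independent counts and perfectly associated counts.
print(cramers_v_corrected([[10, 10], [10, 10]]))
print(cramers_v_corrected([[50, 0], [0, 50]]))
```

The `max(0.0, ...)` clamp is what keeps the corrected estimate at zero when the uncorrected statistic is smaller than its expected bias under independence.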

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for the phik package.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completeness.

Sample

First rows

index,HighBP,HighChol,CholCheck,BMI,Smoker,Stroke,HeartDiseaseorAttack,PhysActivity,Fruits,Veggies,HvyAlcoholConsump,AnyHealthcare,NoDocbcCost,GenHlth,MentHlth,PhysHlth,DiffWalk,Sex,Age,Education,Income,Diabetes_binary
0,1.0,1.0,0.0,24.004581,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,2.0,-1.687610,3.048553,0.0,1.0,8.0,1.0,5.0,0.0
1,0.0,0.0,1.0,22.711613,0.0,1.0,1.0,1.0,0.0,0.0,1.0,1.0,1.0,3.0,-0.560605,4.661774,0.0,0.0,11.0,1.0,8.0,0.0
2,1.0,0.0,1.0,37.250080,0.0,1.0,0.0,1.0,0.0,0.0,1.0,1.0,0.0,1.0,-2.026969,3.896854,0.0,1.0,11.0,1.0,2.0,0.0
3,1.0,0.0,0.0,50.416874,0.0,1.0,1.0,0.0,1.0,0.0,0.0,1.0,1.0,1.0,-3.663189,2.411585,1.0,1.0,2.0,1.0,3.0,0.0
4,1.0,0.0,1.0,47.375362,0.0,1.0,1.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,-2.494869,1.838790,1.0,1.0,10.0,6.0,6.0,0.0
5,0.0,0.0,1.0,37.274414,1.0,0.0,1.0,1.0,0.0,1.0,1.0,1.0,1.0,4.0,9.708094,2.961163,1.0,1.0,5.0,4.0,4.0,0.0
6,0.0,1.0,1.0,34.342873,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,1.0,5.0,-0.982359,2.364951,0.0,0.0,10.0,1.0,1.0,0.0
7,1.0,0.0,1.0,35.119438,1.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,0.0,1.0,-2.462193,3.277848,1.0,1.0,10.0,6.0,4.0,0.0
8,1.0,0.0,1.0,43.129658,0.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,1.0,2.0,-3.914913,1.326082,1.0,1.0,7.0,1.0,3.0,0.0
9,0.0,0.0,0.0,25.013855,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,1.0,4.0,-1.404998,3.042541,1.0,1.0,2.0,2.0,2.0,0.0

Last rows

index,HighBP,HighChol,CholCheck,BMI,Smoker,Stroke,HeartDiseaseorAttack,PhysActivity,Fruits,Veggies,HvyAlcoholConsump,AnyHealthcare,NoDocbcCost,GenHlth,MentHlth,PhysHlth,DiffWalk,Sex,Age,Education,Income,Diabetes_binary
200182,1.0,1.0,1.0,32.354481,0.0,0.0,1.0,1.0,0.0,0.0,1.0,1.0,1.0,2.0,-3.868329,27.880621,0.0,0.0,7.0,4.0,7.0,1.0
200183,0.0,0.0,0.0,30.379602,0.0,1.0,0.0,0.0,1.0,1.0,0.0,1.0,0.0,3.0,7.805057,26.946455,1.0,0.0,4.0,1.0,1.0,1.0
200184,0.0,1.0,1.0,30.716383,1.0,0.0,1.0,1.0,1.0,0.0,1.0,1.0,0.0,4.0,0.091569,28.474007,0.0,1.0,11.0,2.0,6.0,1.0
200185,0.0,0.0,1.0,27.760445,0.0,1.0,1.0,0.0,1.0,1.0,0.0,1.0,0.0,3.0,1.973641,27.288338,0.0,1.0,7.0,2.0,3.0,1.0
200186,0.0,0.0,0.0,27.786247,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,5.0,-3.146594,28.978472,1.0,1.0,13.0,5.0,4.0,1.0
200187,0.0,1.0,0.0,27.602768,1.0,1.0,1.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,-2.828842,28.948132,1.0,0.0,1.0,3.0,4.0,1.0
200188,1.0,0.0,0.0,31.593697,1.0,1.0,1.0,0.0,1.0,0.0,1.0,1.0,1.0,2.0,12.921141,24.426836,1.0,0.0,7.0,5.0,1.0,1.0
200189,1.0,1.0,1.0,29.047224,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,1.0,4.0,1.841377,27.801836,1.0,0.0,6.0,2.0,5.0,1.0
200190,0.0,1.0,0.0,29.185575,0.0,1.0,1.0,0.0,1.0,0.0,0.0,1.0,1.0,4.0,5.164767,27.609310,1.0,1.0,5.0,3.0,8.0,1.0
200191,1.0,0.0,0.0,25.887663,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,5.0,-2.682448,28.746908,0.0,0.0,8.0,1.0,4.0,1.0