Dataset statistics
Number of variables | 22 |
---|---|
Number of observations | 200192 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 33.6 MiB |
Average record size in memory | 176.0 B |
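The two memory figures above are consistent with each other. As a rough check, the arithmetic can be sketched as follows. The 8-byte width is an assumption (the report does not state the storage type); it corresponds to every column being held as a float64.

```python
# Rough consistency check of the memory figures in the overview table.
# Assumption (not stated in the report): every value is an 8-byte float64.
N_VARIABLES = 22
N_OBSERVATIONS = 200192
BYTES_PER_VALUE = 8  # assumed float64

bytes_per_record = N_VARIABLES * BYTES_PER_VALUE
total_bytes = bytes_per_record * N_OBSERVATIONS
total_mib = total_bytes / 2**20  # 1 MiB = 1,048,576 bytes

print(f"Average record size in memory: {bytes_per_record:.1f} B")
print(f"Total size in memory: {total_mib:.1f} MiB")
```

Under that assumption the script reproduces both reported values, 176.0 B per record and 33.6 MiB in total.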
Variable types
Categorical | 16 |
---|---|
Numeric | 6 |
BMI is highly correlated with PhysHlth and 1 other fields | High correlation |
PhysHlth is highly correlated with BMI and 1 other fields | High correlation |
Diabetes_binary is highly correlated with BMI and 1 other fields | High correlation |
BMI is highly correlated with Diabetes_binary | High correlation |
PhysHlth is highly correlated with Diabetes_binary | High correlation |
MentHlth is highly correlated with PhysHlth and 1 other fields | High correlation |
PhysHlth is highly correlated with BMI and 2 other fields | High correlation |
Diabetes_binary is highly correlated with BMI and 2 other fields | High correlation |
Diabetes_binary is uniformly distributed | Uniform |
Reproduction
Analysis started | 2022-07-28 05:36:30.623486 |
---|---|
Analysis finished | 2022-07-28 05:37:00.887424 |
Duration | 30.26 seconds |
Software version | pandas-profiling v3.2.0 |
Download configuration | config.json |
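The reported duration is simply the difference between the two timestamps above. A minimal check using only the Python standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
started = datetime.fromisoformat("2022-07-28 05:36:30.623486")
finished = datetime.fromisoformat("2022-07-28 05:37:00.887424")

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```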
HighBP
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 11.5 MiB |
Common Values
Value | Count | Frequency (%) |
0.0 | 108430 | 54.2% |
1.0 | 91762 | 45.8% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 308622 | |
. | 200192 | |
1 | 91762 | 15.3% |
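The character counts above can be derived from the value counts if the values are treated as the three-character strings "0.0" and "1.0". That reading is an assumption based on the numbers, not something the report states. A minimal sketch:

```python
from collections import Counter

# Value counts for HighBP, copied from the "Common Values" table.
# Assumption: each value is profiled as the literal string shown here.
value_counts = {"0.0": 108430, "1.0": 91762}

char_counts = Counter()
for text, n_rows in value_counts.items():
    for ch in text:
        # Each character of the string occurs once per row holding that value.
        char_counts[ch] += n_rows

total_chars = sum(char_counts.values())
for ch, count in char_counts.most_common():
    print(f"{ch!r}: {count} ({100 * count / total_chars:.1f}%)")
```

The totals match the table: 600576 characters overall, of which "1" accounts for 91762.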
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 400384 | |
Other Punctuation | 200192 |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 308622 | |
1 | 91762 | 22.9% |
Other Punctuation
Value | Count | Frequency (%) |
. | 200192 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 600576 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 308622 | |
. | 200192 | |
1 | 91762 | 15.3% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 600576 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 308622 | |
. | 200192 | |
1 | 91762 | 15.3% |
HighChol
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 11.5 MiB |
Common Values
Value | Count | Frequency (%) |
0.0 | 111317 | 55.6% |
1.0 | 88875 | 44.4% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 311509 | |
. | 200192 | |
1 | 88875 | 14.8% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 400384 | |
Other Punctuation | 200192 |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 311509 | |
1 | 88875 | 22.2% |
Other Punctuation
Value | Count | Frequency (%) |
. | 200192 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 600576 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 311509 | |
. | 200192 | |
1 | 88875 | 14.8% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 600576 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 311509 | |
. | 200192 | |
1 | 88875 | 14.8% |
CholCheck
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 11.5 MiB |
Common Values
Value | Count | Frequency (%) |
0.0 | 110920 | 55.4% |
1.0 | 89272 | 44.6% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 311112 | |
. | 200192 | |
1 | 89272 | 14.9% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 400384 | |
Other Punctuation | 200192 |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 311112 | |
1 | 89272 | 22.3% |
Other Punctuation
Value | Count | Frequency (%) |
. | 200192 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 600576 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 311112 | |
. | 200192 | |
1 | 89272 | 14.9% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 600576 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 311112 | |
. | 200192 | |
1 | 89272 | 14.9% |
BMI
Real number (ℝ≥0)
Distinct | 197844 |
---|---|
Distinct (%) | 98.8% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 34.94536217 |
Minimum | 12.07818127 |
---|---|
Maximum | 73.7012558 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.5 MiB |
Quantile statistics
Minimum | 12.07818127 |
---|---|
5-th percentile | 23.08314381 |
Q1 | 27.07612228 |
median | 32.5464077 |
Q3 | 41.96646118 |
95-th percentile | 52.87893448 |
Maximum | 73.7012558 |
Range | 61.62307453 |
Interquartile range (IQR) | 14.8903389 |
Descriptive statistics
Standard deviation | 9.606681251 |
---|---|
Coefficient of variation (CV) | 0.2749057573 |
Kurtosis | -0.4092645864 |
Mean | 34.94536217 |
Median Absolute Deviation (MAD) | 6.634771347 |
Skewness | 0.6640821088 |
Sum | 6995781.944 |
Variance | 92.28832467 |
Monotonicity | Not monotonic |
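Several of the statistics in this block are derived from the others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A short check using the values printed above (variable names are illustrative):

```python
# Values copied from the quantile and descriptive statistics tables above.
mean = 34.94536217
std = 9.606681251
minimum, maximum = 12.07818127, 73.7012558
q1, q3 = 27.07612228, 41.96646118

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance

print(f"Range: {value_range:.8f}")
print(f"IQR: {iqr:.7f}")
print(f"CV: {cv:.10f}")
print(f"Variance: {variance:.6f}")
```

Because the CV divides by the mean, it becomes large in magnitude and hard to interpret for variables whose mean is close to zero.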
Value | Count | Frequency (%) |
27.49985123 | 3 | < 0.1% |
28.71609306 | 3 | < 0.1% |
28.75860023 | 3 | < 0.1% |
53.49209213 | 3 | < 0.1% |
27.37580109 | 3 | < 0.1% |
34.54478073 | 3 | < 0.1% |
37.25175858 | 3 | < 0.1% |
53.23984909 | 3 | < 0.1% |
32.24931717 | 3 | < 0.1% |
28.67006111 | 3 | < 0.1% |
Other values (197834) | 200162 |
Value | Count | Frequency (%) |
12.07818127 | 1 | |
12.51285267 | 1 | |
12.7978735 | 1 | |
13.03032303 | 1 | |
13.05458164 | 1 | |
13.24094582 | 1 | |
13.49702835 | 1 | |
13.55658054 | 1 | |
13.73091984 | 1 | |
13.74115276 | 1 |
Value | Count | Frequency (%) |
73.7012558 | 1 | |
72.44902802 | 1 | |
70.99916077 | 1 | |
70.82411194 | 1 | |
70.80636597 | 1 | |
70.47361755 | 1 | |
70.2567215 | 1 | |
69.97668457 | 1 | |
69.97412109 | 1 | |
69.8751297 | 1 |
Smoker
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 11.5 MiB |
Common Values
Value | Count | Frequency (%) |
0.0 | 111795 | 55.8% |
1.0 | 88397 | 44.2% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 311987 | |
. | 200192 | |
1 | 88397 | 14.7% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 400384 | |
Other Punctuation | 200192 |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 311987 | |
1 | 88397 | 22.1% |
Other Punctuation
Value | Count | Frequency (%) |
. | 200192 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 600576 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 311987 | |
. | 200192 | |
1 | 88397 | 14.7% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 600576 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 311987 | |
. | 200192 | |
1 | 88397 | 14.7% |
Stroke
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 11.5 MiB |
Common Values
Value | Count | Frequency (%) |
1.0 | 104426 | 52.2% |
0.0 | 95766 | 47.8% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 295958 | |
. | 200192 | |
1 | 104426 | 17.4% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 400384 | |
Other Punctuation | 200192 |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 295958 | |
1 | 104426 | 26.1% |
Other Punctuation
Value | Count | Frequency (%) |
. | 200192 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 600576 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 295958 | |
. | 200192 | |
1 | 104426 | 17.4% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 600576 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 295958 | |
. | 200192 | |
1 | 104426 | 17.4% |
HeartDiseaseorAttack
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 11.5 MiB |
Common Values
Value | Count | Frequency (%) |
0.0 | 107304 | 53.6% |
1.0 | 92888 | 46.4% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 307496 | |
. | 200192 | |
1 | 92888 | 15.5% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 400384 | |
Other Punctuation | 200192 |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 307496 | |
1 | 92888 | 23.2% |
Other Punctuation
Value | Count | Frequency (%) |
. | 200192 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 600576 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 307496 | |
. | 200192 | |
1 | 92888 | 15.5% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 600576 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 307496 | |
. | 200192 | |
1 | 92888 | 15.5% |
PhysActivity
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 11.5 MiB |
Common Values
Value | Count | Frequency (%) |
1.0 | 101184 | 50.5% |
0.0 | 99008 | 49.5% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 299200 | |
. | 200192 | |
1 | 101184 | 16.8% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 400384 | |
Other Punctuation | 200192 |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 299200 | |
1 | 101184 | 25.3% |
Other Punctuation
Value | Count | Frequency (%) |
. | 200192 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 600576 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 299200 | |
. | 200192 | |
1 | 101184 | 16.8% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 600576 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 299200 | |
. | 200192 | |
1 | 101184 | 16.8% |
Fruits
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 11.5 MiB |
Common Values
Value | Count | Frequency (%) |
0.0 | 108051 | 54.0% |
1.0 | 92141 | 46.0% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 308243 | |
. | 200192 | |
1 | 92141 | 15.3% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 400384 | |
Other Punctuation | 200192 |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 308243 | |
1 | 92141 | 23.0% |
Other Punctuation
Value | Count | Frequency (%) |
. | 200192 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 600576 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 308243 | |
. | 200192 | |
1 | 92141 | 15.3% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 600576 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 308243 | |
. | 200192 | |
1 | 92141 | 15.3% |
Veggies
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 11.5 MiB |
Common Values
Value | Count | Frequency (%) |
0.0 | 107286 | 53.6% |
1.0 | 92906 | 46.4% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 307478 | |
. | 200192 | |
1 | 92906 | 15.5% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 400384 | |
Other Punctuation | 200192 |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 307478 | |
1 | 92906 | 23.2% |
Other Punctuation
Value | Count | Frequency (%) |
. | 200192 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 600576 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 307478 | |
. | 200192 | |
1 | 92906 | 15.5% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 600576 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 307478 | |
. | 200192 | |
1 | 92906 | 15.5% |
HvyAlcoholConsump
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 11.5 MiB |
Common Values
Value | Count | Frequency (%) |
0.0 | 102411 | 51.2% |
1.0 | 97781 | 48.8% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 302603 | |
. | 200192 | |
1 | 97781 | 16.3% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 400384 | |
Other Punctuation | 200192 |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 302603 | |
1 | 97781 | 24.4% |
Other Punctuation
Value | Count | Frequency (%) |
. | 200192 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 600576 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 302603 | |
. | 200192 | |
1 | 97781 | 16.3% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 600576 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 302603 | |
. | 200192 | |
1 | 97781 | 16.3% |
AnyHealthcare
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 11.5 MiB |
Common Values
Value | Count | Frequency (%) |
1.0 | 117896 | 58.9% |
0.0 | 82296 | 41.1% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 282488 | |
. | 200192 | |
1 | 117896 |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 400384 | |
Other Punctuation | 200192 |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 282488 | |
1 | 117896 |
Other Punctuation
Value | Count | Frequency (%) |
. | 200192 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 600576 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 282488 | |
. | 200192 | |
1 | 117896 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 600576 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 282488 | |
. | 200192 | |
1 | 117896 |
NoDocbcCost
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 11.5 MiB |
Common Values
Value | Count | Frequency (%) |
0.0 | 102772 | 51.3% |
1.0 | 97420 | 48.7% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 302964 | |
. | 200192 | |
1 | 97420 | 16.2% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 400384 | |
Other Punctuation | 200192 |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 302964 | |
1 | 97420 | 24.3% |
Other Punctuation
Value | Count | Frequency (%) |
. | 200192 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 600576 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 302964 | |
. | 200192 | |
1 | 97420 | 16.2% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 600576 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 302964 | |
. | 200192 | |
1 | 97420 | 16.2% |
GenHlth
Categorical
Distinct | 5 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 11.5 MiB |
Common Values
Value | Count | Frequency (%) |
3.0 | 43571 | 21.8% |
2.0 | 42955 | 21.5% |
4.0 | 42383 | 21.2% |
1.0 | 40477 | 20.2% |
5.0 | 30806 | 15.4% |
Most occurring characters
Value | Count | Frequency (%) |
. | 200192 | |
0 | 200192 | |
3 | 43571 | 7.3% |
2 | 42955 | 7.2% |
4 | 42383 | 7.1% |
1 | 40477 | 6.7% |
5 | 30806 | 5.1% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 400384 | |
Other Punctuation | 200192 |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 200192 | |
3 | 43571 | 10.9% |
2 | 42955 | 10.7% |
4 | 42383 | 10.6% |
1 | 40477 | 10.1% |
5 | 30806 | 7.7% |
Other Punctuation
Value | Count | Frequency (%) |
. | 200192 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 600576 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
. | 200192 | |
0 | 200192 | |
3 | 43571 | 7.3% |
2 | 42955 | 7.2% |
4 | 42383 | 7.1% |
1 | 40477 | 6.7% |
5 | 30806 | 5.1% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 600576 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
. | 200192 | |
0 | 200192 | |
3 | 43571 | 7.3% |
2 | 42955 | 7.2% |
4 | 42383 | 7.1% |
1 | 40477 | 6.7% |
5 | 30806 | 5.1% |
MentHlth
Real number (ℝ)
Distinct | 198758 |
---|---|
Distinct (%) | 99.3% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | -0.2835778274 |
Minimum | -13.49863529 |
---|---|
Maximum | 29.95352745 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 138377 |
Negative (%) | 69.1% |
Memory size | 1.5 MiB |
Quantile statistics
Minimum | -13.49863529 |
---|---|
5-th percentile | -6.880828428 |
Q1 | -3.188583553 |
median | -1.966302216 |
Q3 | 1.407046437 |
95-th percentile | 10.64880967 |
Maximum | 29.95352745 |
Range | 43.45216274 |
Interquartile range (IQR) | 4.59562999 |
Descriptive statistics
Standard deviation | 5.43507722 |
---|---|
Coefficient of variation (CV) | -19.16608668 |
Kurtosis | 4.19952029 |
Mean | -0.2835778274 |
Median Absolute Deviation (MAD) | 1.700447321 |
Skewness | 1.692974324 |
Sum | -56770.01242 |
Variance | 29.54006439 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
-3.533376932 | 3 | < 0.1% |
-2.893257141 | 3 | < 0.1% |
-2.343988895 | 3 | < 0.1% |
-2.378144741 | 3 | < 0.1% |
-3.299041986 | 3 | < 0.1% |
-2.256029367 | 3 | < 0.1% |
-2.96476841 | 3 | < 0.1% |
-2.132397175 | 3 | < 0.1% |
-2.849142551 | 3 | < 0.1% |
-1.449027419 | 3 | < 0.1% |
Other values (198748) | 200162 |
Value | Count | Frequency (%) |
-13.49863529 | 1 | |
-13.44561768 | 1 | |
-13.39755535 | 1 | |
-13.29034138 | 1 | |
-13.12091064 | 1 | |
-13.11013794 | 1 | |
-12.87581253 | 1 | |
-12.87448025 | 1 | |
-12.7059145 | 1 | |
-12.68376541 | 1 |
Value | Count | Frequency (%) |
29.95352745 | 1 | |
29.94845009 | 1 | |
29.9420166 | 1 | |
29.93429565 | 1 | |
29.92651367 | 1 | |
29.92457199 | 1 | |
29.92361641 | 1 | |
29.92251587 | 1 | |
29.91820717 | 1 | |
29.91556549 | 1 |
PhysHlth
Real number (ℝ)
Distinct | 194381 |
---|---|
Distinct (%) | 97.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 14.95798922 |
Minimum | -13.73297882 |
---|---|
Maximum | 29.14030457 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 7984 |
Negative (%) | 4.0% |
Memory size | 1.5 MiB |
Quantile statistics
Minimum | -13.73297882 |
---|---|
5-th percentile | 0.2596646398 |
Q1 | 2.229513586 |
median | 19.80461884 |
Q3 | 27.96361399 |
95-th percentile | 28.52459154 |
Maximum | 29.14030457 |
Range | 42.87328339 |
Interquartile range (IQR) | 25.7341004 |
Descriptive statistics
Standard deviation | 12.84185559 |
---|---|
Coefficient of variation (CV) | 0.8585282022 |
Kurtosis | -1.937484814 |
Mean | 14.95798922 |
Median Absolute Deviation (MAD) | 9.069877625 |
Skewness | -0.029866479 |
Sum | 2994469.778 |
Variance | 164.9132551 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
27.92738724 | 5 | < 0.1% |
28.14529419 | 4 | < 0.1% |
28.09737778 | 4 | < 0.1% |
28.07441521 | 4 | < 0.1% |
28.61035156 | 4 | < 0.1% |
28.51515388 | 4 | < 0.1% |
27.87932396 | 4 | < 0.1% |
28.1064167 | 4 | < 0.1% |
28.0388279 | 4 | < 0.1% |
27.93398285 | 4 | < 0.1% |
Other values (194371) | 200151 |
Value | Count | Frequency (%) |
-13.73297882 | 1 | |
-13.3829298 | 1 | |
-13.08254147 | 1 | |
-11.80761623 | 1 | |
-11.66084671 | 1 | |
-11.36540604 | 1 | |
-11.27472973 | 1 | |
-10.7930069 | 1 | |
-10.74448013 | 1 | |
-10.62460327 | 1 |
Value | Count | Frequency (%) |
29.14030457 | 1 | |
29.13358688 | 1 | |
29.11510658 | 1 | |
29.10765266 | 1 | |
29.09693527 | 1 | |
29.09644318 | 1 | |
29.09359932 | 1 | |
29.08548546 | 1 | |
29.08523178 | 1 | |
29.08504868 | 1 |
DiffWalk
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 11.5 MiB |
Common Values
Value | Count | Frequency (%) |
1.0 | 106670 | 53.3% |
0.0 | 93522 | 46.7% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 293714 | |
. | 200192 | |
1 | 106670 | 17.8% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 400384 | |
Other Punctuation | 200192 |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 293714 | |
1 | 106670 | 26.6% |
Other Punctuation
Value | Count | Frequency (%) |
. | 200192 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 600576 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 293714 | |
. | 200192 | |
1 | 106670 | 17.8% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 600576 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 293714 | |
. | 200192 | |
1 | 106670 | 17.8% |
Sex
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 11.5 MiB |
Common Values
Value | Count | Frequency (%) |
0.0 | 104165 | 52.0% |
1.0 | 96027 | 48.0% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 304357 | |
. | 200192 | |
1 | 96027 | 16.0% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 400384 | |
Other Punctuation | 200192 |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 304357 | |
1 | 96027 | 24.0% |
Other Punctuation
Value | Count | Frequency (%) |
. | 200192 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 600576 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 304357 | |
. | 200192 | |
1 | 96027 | 16.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 600576 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 304357 | |
. | 200192 | |
1 | 96027 | 16.0% |
Age
Real number (ℝ≥0)
Distinct | 13 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 6.983016304 |
Minimum | 1 |
---|---|
Maximum | 13 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.5 MiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 4 |
median | 7 |
Q3 | 10 |
95-th percentile | 13 |
Maximum | 13 |
Range | 12 |
Interquartile range (IQR) | 6 |
Descriptive statistics
Standard deviation | 3.748508006 |
---|---|
Coefficient of variation (CV) | 0.5368035592 |
Kurtosis | -1.226330109 |
Mean | 6.983016304 |
Median Absolute Deviation (MAD) | 3 |
Skewness | 0.01021103337 |
Sum | 1397944 |
Variance | 14.05131227 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
2 | 17739 | |
10 | 17122 | |
7 | 16686 | |
13 | 15794 | 7.9% |
3 | 15753 | 7.9% |
11 | 15731 | 7.9% |
8 | 15398 | 7.7% |
5 | 15363 | 7.7% |
6 | 14987 | 7.5% |
12 | 14350 | 7.2% |
Other values (3) | 41269 |
Value | Count | Frequency (%) |
1 | 14280 | |
2 | 17739 | |
3 | 15753 | |
4 | 14096 | |
5 | 15363 | |
6 | 14987 | |
7 | 16686 | |
8 | 15398 | |
9 | 12893 | |
10 | 17122 |
Value | Count | Frequency (%) |
13 | 15794 | |
12 | 14350 | |
11 | 15731 | |
10 | 17122 | |
9 | 12893 | |
8 | 15398 | |
7 | 16686 | |
6 | 14987 | |
5 | 15363 | |
4 | 14096 |
Education
Real number (ℝ≥0)
Distinct | 6 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 3.402548553 |
Minimum | 1 |
---|---|
Maximum | 6 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.5 MiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 2 |
median | 3 |
Q3 | 5 |
95-th percentile | 6 |
Maximum | 6 |
Range | 5 |
Interquartile range (IQR) | 3 |
Descriptive statistics
Standard deviation | 1.747906292 |
---|---|
Coefficient of variation (CV) | 0.5137050256 |
Kurtosis | -1.319063595 |
Mean | 3.402548553 |
Median Absolute Deviation (MAD) | 2 |
Skewness | 0.02161135433 |
Sum | 681163 |
Variance | 3.055176405 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
1 | 41698 | |
4 | 34662 | |
5 | 33245 | |
6 | 31038 | |
2 | 30283 | |
3 | 29266 |
Value | Count | Frequency (%) |
1 | 41698 | |
2 | 30283 | |
3 | 29266 | |
4 | 34662 | |
5 | 33245 | |
6 | 31038 |
Value | Count | Frequency (%) |
6 | 31038 | |
5 | 33245 | |
4 | 34662 | |
3 | 29266 | |
2 | 30283 | |
1 | 41698 |
Income
Real number (ℝ≥0)
Distinct | 8 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 4.356038203 |
Minimum | 1 |
---|---|
Maximum | 8 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.5 MiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 2 |
median | 4 |
Q3 | 6 |
95-th percentile | 8 |
Maximum | 8 |
Range | 7 |
Interquartile range (IQR) | 4 |
Descriptive statistics
Standard deviation | 2.226225965 |
---|---|
Coefficient of variation (CV) | 0.511066676 |
Kurtosis | -1.179795893 |
Mean | 4.356038203 |
Median Absolute Deviation (MAD) | 2 |
Skewness | 0.1265259176 |
Sum | 872044 |
Variance | 4.956082047 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
2 | 31527 | |
5 | 30528 | |
3 | 29104 | |
8 | 22324 | |
6 | 22219 | |
4 | 21630 | |
1 | 21568 | |
7 | 21292 |
Value | Count | Frequency (%) |
1 | 21568 | |
2 | 31527 | |
3 | 29104 | |
4 | 21630 | |
5 | 30528 | |
6 | 22219 | |
7 | 21292 | |
8 | 22324 |
Value | Count | Frequency (%) |
8 | 22324 | |
7 | 21292 | |
6 | 22219 | |
5 | 30528 | |
4 | 21630 | |
3 | 29104 | |
2 | 31527 | |
1 | 21568 |
Diabetes_binary
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 11.5 MiB |
Common Values
Value | Count | Frequency (%) |
1.0 | 100096 | 50.0% |
0.0 | 100096 | 50.0% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 300288 | |
. | 200192 | |
1 | 100096 | 16.7% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 400384 | |
Other Punctuation | 200192 |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 300288 | |
1 | 100096 | 25.0% |
Other Punctuation
Value | Count | Frequency (%) |
. | 200192 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 600576 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 300288 | |
. | 200192 | |
1 | 100096 | 16.7% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 600576 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 300288 | |
. | 200192 | |
1 | 100096 | 16.7% |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
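As an illustration of this definition, here is a minimal pure-Python sketch. It is not pandas-profiling's own implementation; the helper names `average_ranks`, `pearson`, and `spearman` are hypothetical, and tied values are assumed to share the average of the ranks they span.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values get the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of standard deviations.

    The 1/n factors cancel between numerator and denominator, so sums are used.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear in x
print("Spearman:", spearman(x, y))
print("Pearson: ", pearson(x, y))
```

For this toy data the relationship is perfectly monotonic, so ρ reaches its maximum while Pearson's r stays below 1.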
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
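The formula and the invariance property can be checked with a small pure-Python sketch. The function name `pearson_r` and the sample data are illustrative only, and the rescaling uses positive scale factors (a negative factor would flip the sign of r).

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.0, 9.8]  # roughly linear in x

r = pearson_r(x, y)
# Shift and rescale each variable separately (positive scale factors).
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

# Exact linear relationships give r = 1 regardless of slope.
r_shallow = pearson_r(x, [0.1 * a for a in x])
r_steep = pearson_r(x, [100 * a for a in x])

print(r, r_rescaled, r_shallow, r_steep)
```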
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
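The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch follows; the function name `kendall_tau_a` is hypothetical, and tied pairs are counted as neither concordant nor discordant. Statistical libraries often default to a tie-adjusted variant (tau-b), so results can differ when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 5]
tau = kendall_tau_a(x, y)
print("tau-a:", tau)
```

In this example there are 10 pairs, of which 7 are concordant and 3 discordant, giving τ = 0.4.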
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available.
First rows
Index | HighBP | HighChol | CholCheck | BMI | Smoker | Stroke | HeartDiseaseorAttack | PhysActivity | Fruits | Veggies | HvyAlcoholConsump | AnyHealthcare | NoDocbcCost | GenHlth | MentHlth | PhysHlth | DiffWalk | Sex | Age | Education | Income | Diabetes_binary |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1.0 | 1.0 | 0.0 | 24.004581 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 2.0 | -1.687610 | 3.048553 | 0.0 | 1.0 | 8.0 | 1.0 | 5.0 | 0.0 |
1 | 0.0 | 0.0 | 1.0 | 22.711613 | 0.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 | 3.0 | -0.560605 | 4.661774 | 0.0 | 0.0 | 11.0 | 1.0 | 8.0 | 0.0 |
2 | 1.0 | 0.0 | 1.0 | 37.250080 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | -2.026969 | 3.896854 | 0.0 | 1.0 | 11.0 | 1.0 | 2.0 | 0.0 |
3 | 1.0 | 0.0 | 0.0 | 50.416874 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 | -3.663189 | 2.411585 | 1.0 | 1.0 | 2.0 | 1.0 | 3.0 | 0.0 |
4 | 1.0 | 0.0 | 1.0 | 47.375362 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1.0 | -2.494869 | 1.838790 | 1.0 | 1.0 | 10.0 | 6.0 | 6.0 | 0.0 |
5 | 0.0 | 0.0 | 1.0 | 37.274414 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 4.0 | 9.708094 | 2.961163 | 1.0 | 1.0 | 5.0 | 4.0 | 4.0 | 0.0 |
6 | 0.0 | 1.0 | 1.0 | 34.342873 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 5.0 | -0.982359 | 2.364951 | 0.0 | 0.0 | 10.0 | 1.0 | 1.0 | 0.0 |
7 | 1.0 | 0.0 | 1.0 | 35.119438 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | -2.462193 | 3.277848 | 1.0 | 1.0 | 10.0 | 6.0 | 4.0 | 0.0 |
8 | 1.0 | 0.0 | 1.0 | 43.129658 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 2.0 | -3.914913 | 1.326082 | 1.0 | 1.0 | 7.0 | 1.0 | 3.0 | 0.0 |
9 | 0.0 | 0.0 | 0.0 | 25.013855 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 4.0 | -1.404998 | 3.042541 | 1.0 | 1.0 | 2.0 | 2.0 | 2.0 | 0.0 |
Last rows
Index | HighBP | HighChol | CholCheck | BMI | Smoker | Stroke | HeartDiseaseorAttack | PhysActivity | Fruits | Veggies | HvyAlcoholConsump | AnyHealthcare | NoDocbcCost | GenHlth | MentHlth | PhysHlth | DiffWalk | Sex | Age | Education | Income | Diabetes_binary |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
200182 | 1.0 | 1.0 | 1.0 | 32.354481 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 | 2.0 | -3.868329 | 27.880621 | 0.0 | 0.0 | 7.0 | 4.0 | 7.0 | 1.0 |
200183 | 0.0 | 0.0 | 0.0 | 30.379602 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 3.0 | 7.805057 | 26.946455 | 1.0 | 0.0 | 4.0 | 1.0 | 1.0 | 1.0 |
200184 | 0.0 | 1.0 | 1.0 | 30.716383 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 4.0 | 0.091569 | 28.474007 | 0.0 | 1.0 | 11.0 | 2.0 | 6.0 | 1.0 |
200185 | 0.0 | 0.0 | 1.0 | 27.760445 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 3.0 | 1.973641 | 27.288338 | 0.0 | 1.0 | 7.0 | 2.0 | 3.0 | 1.0 |
200186 | 0.0 | 0.0 | 0.0 | 27.786247 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 5.0 | -3.146594 | 28.978472 | 1.0 | 1.0 | 13.0 | 5.0 | 4.0 | 1.0 |
200187 | 0.0 | 1.0 | 0.0 | 27.602768 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | -2.828842 | 28.948132 | 1.0 | 0.0 | 1.0 | 3.0 | 4.0 | 1.0 |
200188 | 1.0 | 0.0 | 0.0 | 31.593697 | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | 2.0 | 12.921141 | 24.426836 | 1.0 | 0.0 | 7.0 | 5.0 | 1.0 | 1.0 |
200189 | 1.0 | 1.0 | 1.0 | 29.047224 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 4.0 | 1.841377 | 27.801836 | 1.0 | 0.0 | 6.0 | 2.0 | 5.0 | 1.0 |
200190 | 0.0 | 1.0 | 0.0 | 29.185575 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 4.0 | 5.164767 | 27.609310 | 1.0 | 1.0 | 5.0 | 3.0 | 8.0 | 1.0 |
200191 | 1.0 | 0.0 | 0.0 | 25.887663 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 5.0 | -2.682448 | 28.746908 | 0.0 | 0.0 | 8.0 | 1.0 | 4.0 | 1.0 |