statistics - Sorting data via normal distribution in Python without using a third-party library - Stack Overflow

I have a data list where, in each item, index 0 is just the name and index 1 is the height of the bar. I want to sort it via a normal distribution, but I'm not sure what I'm doing wrong.

import math

def normaldistribution(data):
    print(f"data list: {data}")
    
    # Extract heights
    heights = [item[1] for item in data]

    # Compute mean (μ) and standard deviation (σ)
    mean = sum(heights) / len(heights)
    variance = sum((x - mean) ** 2 for x in heights) / len(heights)
    std_dev = math.sqrt(variance) if variance > 0 else 1  # Prevent division by zero

    # Print calculated mean, variance, std deviation
    print(f"mean: {mean} variance: {variance} std dev {std_dev}")

    # Compute absolute distance from mean
    data_with_distance = [(name, height, abs(height - mean), index) for index, (name, height) in enumerate(data)]

    print(f"data with distance: {data_with_distance}")
    
    # Sort by:
    # 1. Absolute distance from mean (smallest first)
    # 2. Original order (to maintain stability)
    sorted_data = sorted(data_with_distance, key=lambda x: (x[2], x[3]))

    # Remove extra values and keep only (name, height) pairs
    sorted_data = [[name, height] for name, height, _, _ in sorted_data]

    # Print sorted data
    print(sorted_data)
    return sorted_data

My output looks like this:

data list: [[1, 5], [2, 4], [8, 1], [22, 8], [24, 2], [46, 1]]
mean: 3.5 variance: 6.25 std dev 2.5
data with distance: [(1, 5, 1.5, 0), (2, 4, 0.5, 1), (8, 1, 2.5, 2), (22, 8, 4.5, 3), (24, 2, 1.5, 4), (46, 1, 2.5, 5)]
[[2, 4], [1, 5], [24, 2], [8, 1], [46, 1], [22, 8]]
data list: [[1, 18], [2, 6], [3, 3], [4, 1]]
mean: 7.0 variance: 43.5 std dev 6.59545297913646
data with distance: [(1, 18, 11.0, 0), (2, 6, 1.0, 1), (3, 3, 4.0, 2), (4, 1, 6.0, 3)]
[[2, 6], [3, 3], [4, 1], [1, 18]]

Please don't mention any third-party libraries; they are useless to me because I cannot use them at all.

In my output, the highest value keeps ending up as the last value, and I'm struggling to understand why.
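A possible explanation, sketched under assumptions: the sort key is the absolute distance from the mean in ascending order, so the height farthest from the mean ends up last, and in both runs above that is the largest height (8 is 4.5 away from 3.5, and 18 is 11.0 away from 7.0). If the intended result is instead a bell-shaped arrangement, with the tallest bar in the middle and shorter bars falling away on both sides, one standard-library-only approach is to sort by height and alternate items between the two sides. That intended result is an assumption about the question, and `distances_from_mean` and `bell_arrange` are hypothetical names used only for illustration.

```python
def distances_from_mean(data):
    """Return each height's absolute distance from the mean height."""
    heights = [height for _, height in data]
    mean = sum(heights) / len(heights)
    return [abs(height - mean) for height in heights]


def bell_arrange(data):
    """Place the tallest item in the middle, shorter ones toward both ends.

    Hypothetical helper: assumes a bell-shaped layout is the goal.
    """
    # Tallest first; sorted() is stable, so equal heights keep input order.
    by_height = sorted(data, key=lambda item: item[1], reverse=True)
    left, right = [], []
    for i, item in enumerate(by_height):
        # Alternate sides so both tails shrink at a similar rate.
        if i % 2 == 0:
            right.append(item)
        else:
            left.append(item)
    # left was collected tallest-first, so reverse it to rise toward the centre.
    return left[::-1] + right


data = [[1, 18], [2, 6], [3, 3], [4, 1]]
print(distances_from_mean(data))  # [11.0, 1.0, 4.0, 6.0]
print(bell_arrange(data))         # [[4, 1], [2, 6], [1, 18], [3, 3]]
```

With the first data list from the output above, `bell_arrange` gives heights in the order 1, 2, 5, 8, 4, 1, so the peak sits in the middle rather than at the end.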
