k-means clustering

[작성중]
이걸 써서 푼 문제는 17978이 유일한데, 이것도 사실 정해는 아니다.
k-means clustering으로 나눠진 decision boundary는 linear하다
- 보로노이 다이어그램이 만들어진다
알고리즘은 다양하다
- https://towardsdatascience.com/three-versions-of-k-means-cf939b65f4ea
- 가장 기본적인 것은 Lloyd's algorithm이다
- 수렴 속도를 빠르게 하기 위해서 다음의 방법들이 나왔다.
  - MacQueen’s k-Means
  - Hartingan-Wong k-Means

구현

3차원 좌표계, k=2 일때

def k_means(points, init_centeroids):
    """LLoyd's algorithm on 3d-points."""
    dist = INF
    (x0, y0, z0), (x1, y1, z1) = init_centeroids
    for _ in range(MAX_ITER):
        prev_dist, dist = dist, 0.0
        cluster0, cluster1 = [0, 0, 0], [0, 0, 0]
        count0 = count1 = 0
        for x, y, z in points:
            d0 = (x0 - x) * (x0 - x) + (y0 - y) * (y0 - y) + (z0 - z) * (z0 - z)
            d1 = (x1 - x) * (x1 - x) + (y1 - y) * (y1 - y) + (z1 - z) * (z1 - z)
            if d0 < d1:
                dist += d0
                count0 += 1
                c = cluster0
            else:
                dist += d1
                count1 += 1
                c = cluster1
            c[0] += x
            c[1] += y
            c[2] += z
        if abs(dist - prev_dist) < EPS:
            break

        x0, y0, z0 = [x / count0 for x in cluster0]
        x1, y1, z1 = [x / count1 for x in cluster1]

    return dist

사용된 문제 : Washer

토론

실명:

이메일:

웹사이트:

댓글을 입력하세요:

댓글 구독