@CrazyDaffodils
Last active January 13, 2020 23:33

Revisions

  1. CrazyDaffodils revised this gist Jan 13, 2020. 1 changed file with 2 additions and 0 deletions.
    2 changes: 2 additions & 0 deletions pca
    @@ -1,3 +1,5 @@
    +from sklearn.decomposition import PCA
    +import seaborn as sns
     #Visualize data using Principal Component Analysis.
     print("Principal Component Analysis (PCA)")
     pca = PCA(n_components = 2).fit_transform(X_std)
  2. CrazyDaffodils revised this gist Jan 13, 2020. 1 changed file with 1 addition and 13 deletions.
    14 changes: 1 addition & 13 deletions pca
    @@ -4,16 +4,4 @@ pca = PCA(n_components = 2).fit_transform(X_std)
     pca_df = pd.DataFrame(data=pca, columns=['PC1','PC2']).join(labels)
     palette = sns.color_palette("muted", n_colors=5)
     sns.set_style("white")
    -sns.scatterplot(x='PC1',y='PC2',hue='Class',data=pca_df, palette=palette, linewidth=0.2, s=30, alpha=1).set_title('PCA')
    -
    -#Fitting PCA on Data
    -print("Explained Variance of PCA components")
    -pca_std = PCA().fit(X_std)
    -percent_variance=pca_std.explained_variance_ratio_*100
    -#Plotting Cumulative Summation of the Explained Variance
    -plt.figure()
    -plt.plot(np.cumsum(pca_std.explained_variance_ratio_))
    -plt.xlabel('Number of Components')
    -plt.ylabel('Variance (%)') #for each component
    -plt.title('Cancer Dataset - Cumulative Explained Variance')
    -plt.show()
    +sns.scatterplot(x='PC1',y='PC2',hue='Class',data=pca_df, palette=palette, linewidth=0.2, s=30, alpha=1).set_title('PCA')
  3. CrazyDaffodils revised this gist Jan 13, 2020. 1 changed file with 13 additions and 1 deletion.
    14 changes: 13 additions & 1 deletion pca
    @@ -4,4 +4,16 @@ pca = PCA(n_components = 2).fit_transform(X_std)
     pca_df = pd.DataFrame(data=pca, columns=['PC1','PC2']).join(labels)
     palette = sns.color_palette("muted", n_colors=5)
     sns.set_style("white")
    -sns.scatterplot(x='PC1',y='PC2',hue='Class',data=pca_df, palette=palette, linewidth=0.2, s=30, alpha=1).set_title('PCA')
    +sns.scatterplot(x='PC1',y='PC2',hue='Class',data=pca_df, palette=palette, linewidth=0.2, s=30, alpha=1).set_title('PCA')
    +
    +#Fitting PCA on Data
    +print("Explained Variance of PCA components")
    +pca_std = PCA().fit(X_std)
    +percent_variance=pca_std.explained_variance_ratio_*100
    +#Plotting Cumulative Summation of the Explained Variance
    +plt.figure()
    +plt.plot(np.cumsum(pca_std.explained_variance_ratio_))
    +plt.xlabel('Number of Components')
    +plt.ylabel('Variance (%)') #for each component
    +plt.title('Cancer Dataset - Cumulative Explained Variance')
    +plt.show()
  4. CrazyDaffodils created this gist Jan 13, 2020.
    7 changes: 7 additions & 0 deletions pca
    @@ -0,0 +1,7 @@
    +#Visualize data using Principal Component Analysis.
    +print("Principal Component Analysis (PCA)")
    +pca = PCA(n_components = 2).fit_transform(X_std)
    +pca_df = pd.DataFrame(data=pca, columns=['PC1','PC2']).join(labels)
    +palette = sns.color_palette("muted", n_colors=5)
    +sns.set_style("white")
    +sns.scatterplot(x='PC1',y='PC2',hue='Class',data=pca_df, palette=palette, linewidth=0.2, s=30, alpha=1).set_title('PCA')
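
Note on the cumulative explained variance block (added in one revision, removed in the next): it labels the y-axis 'Variance (%)' but plots `np.cumsum(pca_std.explained_variance_ratio_)`, which is a fraction between 0 and 1, and `percent_variance` is computed but never used. The following is a minimal sketch of one way to get cumulative percentages and the number of components needed to reach a target. The helper names `cumulative_percent` and `components_needed` and the example ratios are hypothetical, chosen for illustration only; they are not part of the gist or of scikit-learn.

```python
from itertools import accumulate


def cumulative_percent(ratios):
    """Running total of explained-variance ratios, scaled to percent."""
    return [100.0 * total for total in accumulate(ratios)]


def components_needed(ratios, threshold_percent):
    """Smallest number of leading components whose cumulative share
    reaches threshold_percent, or None if it is never reached."""
    for count, percent in enumerate(cumulative_percent(ratios), start=1):
        if percent >= threshold_percent:
            return count
    return None


# Illustrative ratios only (hypothetical, NOT computed from the gist's dataset).
example_ratios = [0.5, 0.25, 0.125, 0.125]
example_cumulative = cumulative_percent(example_ratios)
```

In the gist's setting, the input would presumably be `pca_std.explained_variance_ratio_`. Plotting the result against `range(1, len(ratios) + 1)` would also make the x-axis match its 'Number of Components' label, since `plt.plot` with a single argument numbers the points from 0.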