Author Topic: Particle count on multiple selected clusters  (Read 90 times)

shanis

  • Newbie
  • *
  • Posts: 8
Particle count on multiple selected clusters
« on: May 09, 2019, 07:16:46 PM »
Dear Ovito users,

I am trying to count particles by type in only specific clusters (let's say 3 out of 5 clusters). Here is the code I use for the counting; however, at the moment the counting is executed on all clusters. How can I perform the counting only on selected clusters?

Code: [Select]
import numpy as np
from ovito.modifiers import ClusterAnalysisModifier

cluster = ClusterAnalysisModifier(sort_by_size=True)
pipeline.modifiers.append(cluster)
data = pipeline.compute()

num_particles = data.particles.count

def particle_counting(frame, data):
    # One counter per possible cluster ID (cluster IDs never exceed the particle count).
    cluster_sizes = np.zeros(num_particles, dtype=int)

    # Count the C atoms in each cluster.
    for pidx in range(num_particles):
        ptype = data.particles['Particle Type'][pidx]
        if data.particles['Particle Type'].type_by_id(ptype).name == 'C':
            cluster_sizes[data.particles['Cluster'][pidx]] += 1

    data.particles_.create_property('ClusterSize', data=cluster_sizes)
    data.attributes['ClusterSize'] = list(cluster_sizes)
« Last Edit: May 09, 2019, 07:37:56 PM by Constanze Kalcher »

Constanze Kalcher

  • Administrator
  • Sr. Member
  • *****
  • Posts: 259
Re: Particle count on multiple selected clusters
« Reply #1 on: May 10, 2019, 12:24:28 PM »
Dear shanis,

In your code snippet above, you could create a selection based on the cluster IDs and then loop only over the particles whose Selection property is 1.
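
For illustration, here is a minimal sketch of that first approach as a user-defined modifier function (the function name, the attribute name, and the choice of cluster IDs 1-3 are placeholders; with sort_by_size=True the clusters are numbered from 1 in order of decreasing size):
Code: [Select]
import numpy as np

def count_C_in_selected_clusters(frame, data):
    # Select the particles belonging to the clusters of interest (here IDs 1, 2, 3).
    selection = data.particles_.create_property('Selection', dtype=int, components=1)
    with selection:
        selection[...] = np.isin(data.particles['Cluster'], [1, 2, 3])

    # Loop only over particles with Selection == 1 and count the C atoms among them.
    ptypes = data.particles['Particle Type']
    count = 0
    for pidx in np.flatnonzero(data.particles['Selection'] == 1):
        if ptypes.type_by_id(ptypes[pidx]).name == 'C':
            count += 1
    data.attributes['C_count_selected_clusters'] = count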

Alternatively, you could simply use numpy.bincount() to calculate the number of C atoms in each cluster:
Code: [Select]
import numpy

def particle_counting(frame, data):
    ptype = data.particles['Particle Type']
    # Numeric ID of particle type 'C'.
    C_type_id = ptype.type_by_name('C').id
    # For each cluster ID, count how many C atoms it contains.
    cluster_sizes = numpy.bincount(data.particles['Cluster'][ptype == C_type_id])
Since you initially sorted the clusters by size, you know which entry of cluster_sizes corresponds to which cluster ID.

Code: [Select]
    cluster_size_property = data.particles_.create_property("ClusterSize", dtype=int, components=1)
    with cluster_size_property:
        # Adapt the range to your needs; the first three clusters, for example, would be range(1, 4).
        for i in range(len(cluster_sizes)):
            cluster_size_property[(ptype == C_type_id) & (data.particles['Cluster'] == i)] = cluster_sizes[i]
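
As a rough usage sketch (the input file name is a placeholder, and it assumes your OVITO version lets you append a Python modifier function directly to pipeline.modifiers):
Code: [Select]
from ovito.io import import_file
from ovito.modifiers import ClusterAnalysisModifier

pipeline = import_file("trajectory.dump")       # placeholder input file
pipeline.modifiers.append(ClusterAnalysisModifier(sort_by_size=True))
pipeline.modifiers.append(particle_counting)    # the user-defined modifier function from above

data = pipeline.compute()
# Per-particle 'ClusterSize' values assigned by the modifier (counting C atoms only):
print(data.particles['ClusterSize'][...])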

-Constanze
« Last Edit: May 10, 2019, 03:47:51 PM by Constanze Kalcher »

shanis

  • Newbie
  • *
  • Posts: 8
Re: Particle count on multiple selected clusters
« Reply #2 on: May 16, 2019, 10:01:20 PM »
Thanks Constanze.

Now I am trying to compile, for each frame, the ClusterSize values that exceed a cutoff (let's say size > 100) and export all of these values for every frame into a single text file (n_columns = number of frames, n_rows = maximum number of clusters). I am guessing I can do

pipeline.modifiers.append(SelectExpressionModifier(expression='ClusterSize > 100'))

and then somehow loop over each frame while restricting myself to Selection == 1, but I am not sure how to do this exactly at the moment.

Let me know if you have suggestions.

Constanze Kalcher

  • Administrator
  • Sr. Member
  • *****
  • Posts: 259
Re: Particle count on multiple selected clusters
« Reply #3 on: May 17, 2019, 10:44:02 AM »
Hi shanis,

If you use the expression selection modifier with the expression you posted above, you will select all carbon atoms that belong to clusters containing more than 100 carbon atoms. You could then continue from there by counting the distinct cluster IDs (i.e. the number of clusters) within that particle selection, e.g. using numpy.unique():

Code: [Select]
selection = data.particles['Selection']
cluster_ids = data.particles['Cluster']
# Number of distinct cluster IDs among the selected particles:
cluster_count = len(numpy.unique(cluster_ids[selection == 1]))
data.attributes["My_cluster_count"] = cluster_count

For each frame, you could then store that information as a global attribute, which in the end can be exported using the file export format "txt/attr", e.g.:

Code: [Select]
export_file(pipeline, "cluster_count.txt", "txt/attr", multiple_frames = True,
         columns = ["Frame", "My_cluster_count"])

-Constanze
« Last Edit: May 17, 2019, 11:01:11 AM by Constanze Kalcher »

shanis

  • Newbie
  • *
  • Posts: 8
Re: Particle count on multiple selected clusters
« Reply #4 on: May 17, 2019, 03:57:42 PM »
Thanks a lot Constanze,

This is helpful for counting clusters above a certain size. However, I also want to output the size of each cluster that has been counted, not just the count. My current idea is to treat the ClusterSize property as an array and use a nested loop over each frame and each selected cluster to output the cluster sizes, but that may be very inefficient given the large number of frames to be analyzed.

Let me know if you have any suggestions for alternatives.

Best

Constanze Kalcher

  • Administrator
  • Sr. Member
  • *****
  • Posts: 259
Re: Particle count on multiple selected clusters
« Reply #5 on: May 17, 2019, 05:24:40 PM »
Hi shanis,

Do I understand you correctly that you want to export all individual cluster IDs along with your custom property "ClusterSize"?
In that case, if you have used numpy.bincount() to calculate the number of C atoms in your different clusters,
Code: [Select]
cluster_sizes = numpy.bincount(data.particles['Cluster'][ptype == C_type_id])
you already have the information you would like to export: cluster_sizes is an ndarray of length max(cluster ID) + 1 whose entry i contains the bincount (i.e. the size, counted in C atoms) of cluster i. You could use numpy.where() to filter out clusters smaller or larger than a certain size and then, e.g., numpy.savetxt() to export this to a text file.
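
For instance, a minimal sketch of that last step (the size cutoff of 100 and the output file name are placeholders):
Code: [Select]
import numpy

# Keep only clusters whose C-atom count exceeds the cutoff.
large_ids = numpy.where(cluster_sizes > 100)[0]
large_sizes = cluster_sizes[large_ids]

# Write two columns: cluster ID and cluster size (number of C atoms).
numpy.savetxt("cluster_sizes.txt",
              numpy.column_stack((large_ids, large_sizes)),
              fmt="%d", header="ClusterID ClusterSize")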

-Constanze