Parse more data #11

kjappelbaum · 2019-05-29T11:00:49Z

Since the parser already has the output file open, I think it could parse some additional data.

Grep for 'WARNING' and put the matching lines into the output parameters and the report. I currently do something like

 if "WARNING" in line:
     self.logger.warning(line)
     warnings.append(line.split('WARNING:')[-1])

when we already loop over the file.
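
A minimal sketch of how this could look as a standalone helper, assuming the output is already available as a list of lines. The name `collect_warnings`, the module-level `logger`, and the de-duplication are assumptions made for illustration, not existing parser code.

```python
import logging

logger = logging.getLogger(__name__)


def collect_warnings(lines):
    """Return the de-duplicated warning messages found in `lines`.

    Hypothetical helper: the name and the return format are assumptions
    made for this sketch.
    """
    warnings = []
    for line in lines:
        if "WARNING" not in line:
            continue
        logger.warning(line.strip())
        # Keep only the text after the last 'WARNING:' marker, if present.
        message = line.split("WARNING:")[-1].strip()
        if message not in warnings:
            warnings.append(message)
    return warnings
```

The returned list could then be stored in the output parameters and written to the report, so that repeated warnings show up only once.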

For debugging and for developing new sampling strategies, it is useful to have information about the MC efficiencies, which are of course only printed if the move is actually used; otherwise the line contains OFF. I currently use something like the following, which is ugly but seems to work:

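def parse_performance_line(line):
    """
    Parse the summary line of a reinsertion or swap move.

    :param line: string in the format
        Component [tip4pew] total tried: 386.000000 succesfull growth: 3.000000 (0.777202 [%]) accepted: 3.000000 (0.777202 [%])
    :return: dictionary with the counts (as int) and the ratios in percent (as float)
    """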
    parts = line.split()
    total_tried = parts[4]
    successfull_growth_total = parts[7]
    successfull_growth_ratio = parts[8].strip('(')
    accepted_total = parts[-3]
    accepted_ratio = parts[-2].strip('(')
    return {
        'total_tried': int(float(total_tried)),
        'successfull_growth_total': int(float(successfull_growth_total)),
        'successfull_growth_ratio': float(successfull_growth_ratio),
        'accepted_total': int(float(accepted_total)),
        'accepted_ratio': float(accepted_ratio)
    }


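def parse_performance_block(lines):
    """
    Parse the statistics block of a translation or rotation move.

    :param lines: list of five strings in reversed file order (the caller
        slices a reversed copy of the file). In the output, the rows appear as
            total        1.000000 1.000000 2.000000
            succesfull   1.000000 1.000000 1.000000
            accepted     1.000000 1.000000 0.500000
            displacement 0.487500 0.487500 0.325000
        so lines[1] to lines[4] hold displacement, accepted, succesfull, total.
    :return: dictionary with the values of each row and the mean acceptance ratio
    """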
    import numpy as np
    totals = [int(float(i)) for i in lines[4].split()[1:]]
    successfull = [int(float(i)) for i in lines[3].split()[1:]]
    acceptance_ratio = [float(i) for i in lines[2].split()[1:]]
    drift = [float(i) for i in lines[1].split()[1:]]
    return {
        'total': totals,
        'successfull': successfull,
        'acceptance_ratio': acceptance_ratio,
        'drift': drift,
        'acceptance_ratio_mean': np.mean(acceptance_ratio)
    }


def parse_performance_mc(f):
    """
    Parse some performance metrics of the MC moves.

    :param f: file content as a list of lines
    :return: dictionary with the efficiencies of the MC moves that were used
    """
    efficiencies_dict = {}
    # Read from the end for efficiency; reverse the list only once.
    reversed_lines = f[::-1]
    for i, line in enumerate(reversed_lines):
        # For these moves the summary line sits two lines after the header in the file.
        if ('Performance of the Reinsertion move:' in line) and ('OFF' not in line):
            efficiencies_dict['reinsertion'] = parse_performance_line(reversed_lines[i - 2])
        if ('Performance of the swap deletion move:' in line) and ('OFF' not in line):
            efficiencies_dict['deletion'] = parse_performance_line(reversed_lines[i - 2])
        if ('Performance of the swap addition move:' in line) and ('OFF' not in line):
            efficiencies_dict['addition'] = parse_performance_line(reversed_lines[i - 2])
        # For these moves, pass the lines at offsets 3 to 7 after the header, in reversed order.
        if ('Performance of the rotation move:' in line) and ('OFF' not in line):
            efficiencies_dict['rotation'] = parse_performance_block(reversed_lines[i - 7:i - 2])
        if ('Performance of the translation move:' in line) and ('OFF' not in line):
            efficiencies_dict['translation'] = parse_performance_block(reversed_lines[i - 7:i - 2])

        if 'Monte-Carlo moves statistics' in line:
            break

    return efficiencies_dict
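
As a possibly less fragile alternative to indexing into `line.split()`, the summary line could be matched with a regular expression. This is only a sketch: the pattern is derived from the single sample line quoted in the `parse_performance_line` docstring, the name `parse_performance_line_re` is hypothetical, and other RASPA versions or unusual values (for example `nan`) may not match.

```python
import re

# Pattern built from the sample line in the docstring above; the spelling
# "succesfull" follows that sample. Assumption: numbers are plain decimals.
PERFORMANCE_LINE = re.compile(
    r"total tried:\s*(?P<total_tried>[\d.]+)\s+"
    r"succesfull growth:\s*(?P<growth_total>[\d.]+)\s+"
    r"\((?P<growth_ratio>[\d.]+)\s*\[%\]\)\s+"
    r"accepted:\s*(?P<accepted_total>[\d.]+)\s+"
    r"\((?P<accepted_ratio>[\d.]+)\s*\[%\]\)"
)


def parse_performance_line_re(line):
    """Regex-based variant; returns None when the line does not match."""
    match = PERFORMANCE_LINE.search(line)
    if match is None:
        return None
    return {
        'total_tried': int(float(match.group('total_tried'))),
        'successfull_growth_total': int(float(match.group('growth_total'))),
        'successfull_growth_ratio': float(match.group('growth_ratio')),
        'accepted_total': int(float(match.group('accepted_total'))),
        'accepted_ratio': float(match.group('accepted_ratio')),
    }
```

Returning `None` for a non-matching line lets the caller skip the move instead of failing with an `IndexError` when the layout changes.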

Also, I would parse all energies that are printed, including the tail-correction energy.
For MD, I would be interested in quantities like the energy drift.
I also feel that it would be good to store some calculation settings, such as EnergyOverlapCriteria, whose default appears to differ from the value given in the manual, to ensure reproducibility across different versions of RASPA (maybe also simply line 3 of the output, which contains the RASPA version).
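
A rough sketch of how such provenance information could be collected, assuming the setting names appear verbatim somewhere in the output. The helper `extract_provenance` is hypothetical, and it stores raw lines instead of parsed values because the exact layout of these lines is not spelled out here.

```python
def extract_provenance(lines, setting_names=("EnergyOverlapCriteria",)):
    """Collect the raw version line and the first line mentioning each setting.

    Hypothetical helper: names and return format are assumptions made for
    this sketch.
    """
    provenance = {
        # Line 3 of the output (index 2) is reported to hold the RASPA version.
        "version_line": lines[2].strip() if len(lines) > 2 else None,
        "settings": {},
    }
    for line in lines:
        for name in setting_names:
            # Keep only the first occurrence of each setting.
            if name in line and name not in provenance["settings"]:
                provenance["settings"][name] = line.strip()
    return provenance
```

Once the real format of these lines is confirmed, the raw strings could be replaced by properly parsed values.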

I will add additional energies that are always there to this issue. Maybe @danieleongari can also give some of his experience.

yakutovicha self-assigned this May 29, 2019

yakutovicha added enhancement feature feature request labels May 29, 2019

yakutovicha changed the title ~~[Feature Request] Parse more data~~ Parse more data May 29, 2019
