Parse more data #11

Open
kjappelbaum opened this issue May 29, 2019 · 0 comments
kjappelbaum commented May 29, 2019

I think that, since the parser already has the file open, it could parse some additional data.

  1. Grep for 'WARNING' and put the matches into the output parameters and the report. Since we already loop over the file, I currently do something like
if "WARNING" in line:
    self.logger.warning(line)
    warnings.append(line.split('WARNING:')[-1])

  2. For debugging and for developing new sampling strategies, it is useful to have information about the MC efficiencies, which are of course only printed if the move is actually used. Otherwise there will be OFF in the line. I currently use something like the following, which is ugly but seems to work
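def parse_performance_line(line):
    """
    :param line: string in format
        Component [tip4pew] total tried: 386.000000 succesfull growth: 3.000000 (0.777202 [%]) accepted: 3.000000 (0.777202 [%])
        that will be parsed
    :return: dictionary with the number of tried moves and, for successful
        growths and accepted moves, the count and the percentage
    """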
    parts = line.split()
    total_tried = parts[4]
    successfull_growth_total = parts[7]
    successfull_growth_ratio = parts[8].strip('(')
    accepted_total = parts[-3]
    accepted_ratio = parts[-2].strip('(')
    return {
        'total_tried': int(float(total_tried)),
        'successfull_growth_total': int(float(successfull_growth_total)),
        'successfull_growth_ratio': float(successfull_growth_ratio),
        'accepted_total': int(float(accepted_total)),
        'accepted_ratio': float(accepted_ratio)
    }


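import numpy as np


def parse_performance_block(lines):
    """
    :param lines: list of five strings sliced from the reversed file, so the
        block arrives bottom-to-top: lines[0] is unused, lines[1] is the
        displacement line, lines[2] the accepted line, lines[3] the
        succesfull line and lines[4] the total line.
        In the output file itself the block reads
        	total        1.000000 1.000000 2.000000
            succesfull   1.000000 1.000000 1.000000
            accepted   1.000000 1.000000 0.500000
            displacement 0.487500 0.487500 0.325000
    :return: dictionary with one list of values per row and the mean acceptance ratio
    """
    # split()[1:] drops the row label and keeps the numeric columns
    totals = [int(float(i)) for i in lines[4].split()[1:]]
    successfull = [int(float(i)) for i in lines[3].split()[1:]]
    acceptance_ratio = [float(i) for i in lines[2].split()[1:]]
    drift = [float(i) for i in lines[1].split()[1:]]
    return {
        'total': totals,
        'successfull': successfull,
        'acceptance_ratio': acceptance_ratio,
        'drift': drift,
        'acceptance_ratio_mean': np.mean(acceptance_ratio)
    }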


def parse_performance_mc(f):
    """
    Parse some performance metrics of the MC moves
    :param f: file content as a list of lines
    :return: dictionary with the efficiencies of the MC moves
    """
    efficiencies_dict = {}
    # read from the end for efficiency; reverse once instead of on every access
    reversed_lines = f[::-1]
    for i, line in enumerate(reversed_lines):
        # in the reversed list, index i - 2 is the line two below the header
        if ('Performance of the Reinsertion move:' in line) and ('OFF' not in line):
            efficiencies_dict['reinsertion'] = parse_performance_line(reversed_lines[i - 2])
        if ('Performance of the swap deletion move:' in line) and ('OFF' not in line):
            efficiencies_dict['deletion'] = parse_performance_line(reversed_lines[i - 2])
        if ('Performance of the swap addition move:' in line) and ('OFF' not in line):
            efficiencies_dict['addition'] = parse_performance_line(reversed_lines[i - 2])
        if ('Performance of the rotation move:' in line) and ('OFF' not in line):
            efficiencies_dict['rotation'] = parse_performance_block(reversed_lines[i - 7:i - 2])
        if ('Performance of the translation move:' in line) and ('OFF' not in line):
            efficiencies_dict['translation'] = parse_performance_block(reversed_lines[i - 7:i - 2])

        # everything above this header is not part of the move statistics
        if 'Monte-Carlo moves statistics' in line:
            break

    return efficiencies_dict
  3. Also, I would parse all energies that are printed, i.e. also the tail-correction energy.
  4. For MD, I would be interested in things like the energy drift.
  5. I somehow feel that it would also be good to store some calculation settings, such as EnergyOverlapCriteria, whose default appears to differ from what the manual says, to ensure reproducibility across different versions of RASPA (maybe also simply line 3 of the output, which is the RASPA version).
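
For the 'WARNING' collection in the first point, a minimal standalone sketch could look like the following. It assumes the output file has already been read into a list of lines; `collect_warnings` is a hypothetical helper name, not something that exists in the plugin, and the sample lines in the usage example are made up for illustration.

```python
def collect_warnings(lines):
    """Return the warning messages found in `lines`, without duplicates,
    in order of first appearance."""
    warnings = []
    for line in lines:
        if "WARNING" not in line:
            continue
        # keep only the text after the marker and drop the colon and whitespace
        message = line.split("WARNING")[-1].lstrip(": ").strip()
        if message and message not in warnings:
            warnings.append(message)
    return warnings


# hypothetical input lines, only to show the call
sample = [
    "some regular output line\n",
    "WARNING: first made-up message\n",
    "WARNING: first made-up message\n",
    "  WARNING: second made-up message\n",
]
print(collect_warnings(sample))
```

Dropping duplicates keeps the report short when the same warning is printed in every cycle; whether that is desirable for the output parameters is a design choice.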

I will add additional energies that are always there to this issue. Maybe @danieleongari can also give some of his experience.
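
As a possible alternative to the positional `split()` indexing in `parse_performance_line`, here is a hedged sketch that uses a regular expression with named groups. The pattern is written only against the single example line quoted in the docstring above, so it is an assumption that other components and moves print the same layout; `parse_performance_line_re` and `PERFORMANCE_LINE` are hypothetical names.

```python
import re

# built from the one example line in the docstring; "succesfull" is spelled
# as it appears in that line
PERFORMANCE_LINE = re.compile(
    r"total tried:\s*(?P<total_tried>[\d.]+)\s+"
    r"succesfull growth:\s*(?P<growth_total>[\d.]+)\s+"
    r"\((?P<growth_ratio>[\d.]+)\s*\[%\]\)\s+"
    r"accepted:\s*(?P<accepted_total>[\d.]+)\s+"
    r"\((?P<accepted_ratio>[\d.]+)\s*\[%\]\)"
)


def parse_performance_line_re(line):
    """Return the same keys as parse_performance_line, or None if the
    line does not match the expected layout."""
    match = PERFORMANCE_LINE.search(line)
    if match is None:
        return None
    values = {key: float(value) for key, value in match.groupdict().items()}
    return {
        'total_tried': int(values['total_tried']),
        'successfull_growth_total': int(values['growth_total']),
        'successfull_growth_ratio': values['growth_ratio'],
        'accepted_total': int(values['accepted_total']),
        'accepted_ratio': values['accepted_ratio'],
    }


example = ("Component [tip4pew] total tried: 386.000000 succesfull growth: "
           "3.000000 (0.777202 [%]) accepted: 3.000000 (0.777202 [%])")
print(parse_performance_line_re(example))
```

Returning `None` for a non-matching line lets the caller skip or log unexpected output instead of failing with an `IndexError`.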

@yakutovicha yakutovicha self-assigned this May 29, 2019
@yakutovicha yakutovicha added enhancement feature feature request labels May 29, 2019
@yakutovicha yakutovicha changed the title [Feature Request] Parse more data Parse more data May 29, 2019