chore: remove duplicate words in strings and comments #11942

Merged Apr 30, 2024 (1 commit)

@rvagg commented Apr 30, 2024

Ref: #11940

I decided to fix the duplicate words wholesale instead of taking individual PRs for them.

Python script to find these:

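import os
import re

def find_direct_duplicates(directory):
    """Report words repeated back-to-back in string literals and // comments of .go files."""
    for root, _dirs, files in os.walk(directory):
        for file in files:
            if not file.endswith(".go"):
                continue
            filepath = os.path.join(root, file)
            # Read as UTF-8 and replace undecodable bytes so one odd file does not abort the scan.
            with open(filepath, 'r', encoding='utf-8', errors='replace') as f:
                lines = f.readlines()
            all_duplicates = set()

            for line in lines:
                # Contents of "double-quoted" and `backtick` strings that open and close on this line.
                strings = re.findall(r'"([^"]*)"|`([^`]*)`', line)
                texts = [text for text_group in strings for text in text_group if text]

                # Everything after the first // on the line is treated as a comment.
                comment_match = re.search(r"//(.+)", line)
                if comment_match:
                    texts.append(comment_match.group(1))

                for text in texts:
                    words = re.split(r'\s+', text.strip().lower())
                    # Keep purely alphabetic words that are immediately repeated.
                    duplicates = [words[i] for i in range(len(words) - 1)
                                  if words[i] == words[i + 1] and words[i].isalpha()]
                    all_duplicates.update(duplicates)

            if all_duplicates:
                print(f"Direct duplicates found in {filepath}: {', '.join(sorted(all_duplicates))}")

if __name__ == "__main__":
    find_direct_duplicates('.')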

(A collaborative effort with a couple of different LLMs).
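
To see what the per-line check does and does not flag, the core logic can be pulled into a small function and run on sample lines. This is only a sketch: `duplicate_words` and the sample inputs are hypothetical and are not part of the PR.

```python
import re

def duplicate_words(line):
    """Return words repeated back-to-back in a line's string literals and // comment."""
    # Collect the contents of "double-quoted" and `backtick` strings.
    strings = re.findall(r'"([^"]*)"|`([^`]*)`', line)
    texts = [text for group in strings for text in group if text]

    # Add the text after the first // on the line, if any.
    comment = re.search(r"//(.+)", line)
    if comment:
        texts.append(comment.group(1))

    found = set()
    for text in texts:
        words = re.split(r"\s+", text.strip().lower())
        found.update(
            words[i]
            for i in range(len(words) - 1)
            if words[i] == words[i + 1] and words[i].isalpha()
        )
    return found


# Hypothetical sample lines, not taken from the lotus codebase.
print(duplicate_words('// returns the the block height'))      # {'the'}
print(duplicate_words('log.Warn("failed to to connect")'))     # {'to'}
print(duplicate_words('x := 1 // 10 10 is numeric, skipped'))  # set()
```

Words are compared as whitespace-separated tokens, so a repeat followed by punctuation (such as `word word.`) is not flagged, and a `//` inside a string literal (such as a URL) is treated as the start of a comment. Both are reasonable trade-offs for a one-off cleanup script.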

The remaining hits after this change are legit:

Direct duplicates found in ./cmd/lotus-miner/info_all.go: very
Direct duplicates found in ./curiosrc/proof/treed_build_test.go: ab
Direct duplicates found in ./extern/filecoin-ffi/bls_test.go: cats
Direct duplicates found in ./chain/events/state/predicates_test.go: to
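
If the script were ever rerun, these four known-legitimate hits would be reported again each time. One way to suppress them, sketched here as an assumption and not something this PR does, is a per-file allowlist built from the output above. `ALLOWED`, `filter_allowed`, and the second sample path are hypothetical.

```python
# Known-legitimate repeats, copied from the script output listed above.
ALLOWED = {
    "./cmd/lotus-miner/info_all.go": {"very"},
    "./curiosrc/proof/treed_build_test.go": {"ab"},
    "./extern/filecoin-ffi/bls_test.go": {"cats"},
    "./chain/events/state/predicates_test.go": {"to"},
}

def filter_allowed(filepath, duplicates):
    """Drop duplicates that are allowlisted for this file and return the rest."""
    return set(duplicates) - ALLOWED.get(filepath, set())

# An allowlisted word is dropped; anything else in the same file is still reported.
print(filter_allowed("./cmd/lotus-miner/info_all.go", {"very", "the"}))  # {'the'}
# The allowlist is per file, so "to" is still reported for a different (made-up) path.
print(filter_allowed("./some/other/file.go", {"to"}))  # {'to'}
```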

@rvagg requested a review from rjan90 on April 30, 2024 03:53
@rvagg merged commit cee77aa into master on Apr 30, 2024
186 checks passed
@rvagg deleted the rvagg/nodupes branch on April 30, 2024 06:04