Evidence Verification Score (EVS)
This score measures the degree to which a belief has been verified by various forms of evidence. It can take into account the independence and quality of scientific studies, historical trends, social experiments, anecdotal evidence, and other relevant factors.
To calculate this score, we assess the relative strength of each piece of evidence proposed as a reason to strengthen or weaken a belief. The score considers the number and similarity of the scenarios tested and the number of replications. It also considers the quality of the studies or evidence, including bias, methodology, and sample size.
Weighting evidence by type is a critical component of our evaluation process because it helps us determine the reliability of different types of evidence. Our weighting algorithm assigns a score to each type of evidence based on its level of independence.
To ensure transparency, we have implemented two separate pro-con arguments with up/down votes and other measures to promote and measure the quality of arguments. This helps us determine our confidence level in the appropriateness of the chosen category for each piece of evidence.
The reliability rankings of different types of evidence, sorted from most to least reliable (based on our current scoring system), are as follows:
- Statistics and Data with links to sources
- Formal scientific studies and results from experiments or trials (meta-analysis, systematic review, randomized controlled trial (double-blind or single-blind), cohort study, case-control study, cross-sectional study, longitudinal study, observational study, correlational study, experimental study, and quasi-experimental study), each with links to the published results
- Proposed historical trends (with references to data from history)
- Expert testimony from relevant authorities, official documents, reports, and published claims (with evidence to support the causal relationships)
- Expert and social media claims
- Personal experience or anecdotal evidence
- Common sense or logical reasoning
- Analogies or metaphors
- Cultural or social norms
- Intuition or gut feeling (based on evolved or adaptive ethics and morals)
- News articles or media reports
- Survey data or public opinion polls
- Eye-witness testimony
- Visual evidence such as photographs or videos
- Historical artifacts or documents
Rest assured, we'll show you our math and provide complete transparency throughout our evaluation process, so you can understand how each piece of evidence is weighted and the impact it has on our overall conclusion.
## Evidence Replication Quantity (ERQ)
Used to account for the number of times a study or experiment has been replicated.
## Evidence Replication Percentage (ERP)
To illustrate the use of ERQ and ERP, consider a hypothetical scenario in which a study examining the effects of a certain medication on a particular disease has been conducted multiple times. The ERQ takes into account the number of times the study has been replicated, while the ERP measures the percentage of replications that produced similar results. Using these metrics, we can evaluate the reliability of the evidence more accurately and make informed decisions based on the strength of the supporting evidence.
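The two replication metrics can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `replication_metrics` and the representation of each replication as a boolean (True when it produced results similar to the original study) are hypothetical and not part of the scoring system described here.

```python
def replication_metrics(outcomes):
    """Return (ERQ, ERP) for a list of replication outcomes.

    `outcomes` is a hypothetical list of booleans: True when a replication
    produced results similar to the original study, False otherwise.
    """
    erq = len(outcomes)                 # ERQ: number of replications
    if erq == 0:
        return 0, 0.0                   # nothing has been replicated yet
    erp = 100.0 * sum(outcomes) / erq   # ERP: percentage with similar results
    return erq, erp


# Hypothetical medication study replicated five times, four of them agreeing.
erq, erp = replication_metrics([True, True, False, True, True])
print(f"ERQ = {erq}, ERP = {erp:.1f}%")  # ERQ = 5, ERP = 80.0%
```

How "similar results" is decided for a given replication is left open here; the sketch simply assumes that judgment has already been made.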
The Evidence-to-Conclusion Relevance Score (ECRS) is a key metric for the open internet evaluation process. It scores the relevance of each piece of evidence presented as a reason to support or oppose a conclusion. The score is calculated from the performance of pro/con sub-arguments over whether the evidence would necessarily prove the conclusion if, for example, it were infinitely replicated by double-blind scientific methods. As with the other scores, we'll show our math and provide complete transparency throughout the evaluation process.
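One possible way to turn the performance of pro/con sub-arguments into an ECRS between 0 and 1 is to take the share of the total sub-argument score that falls on the pro side. The sketch below assumes this normalization, non-negative scores, and a neutral default of 0.5 when nothing has been scored; the function name `ecrs_from_subarguments` is hypothetical, and the actual scoring rule may differ.

```python
def ecrs_from_subarguments(pro_scores, con_scores):
    """Share of the total sub-argument score held by the pro side.

    Assumes non-negative scores. Returns a value in [0, 1], or the
    neutral value 0.5 when no sub-argument has been scored.
    """
    pro_total = sum(pro_scores)
    con_total = sum(con_scores)
    total = pro_total + con_total
    if total == 0:
        return 0.5  # assumed neutral default, not taken from the source
    return pro_total / total


# Hypothetical scores for sub-arguments that the evidence is (pro) or
# is not (con) relevant to the conclusion.
ecrs = ecrs_from_subarguments([8, 7, 9], [5, 3])
print(ecrs)  # 24 / (24 + 8) = 0.75
```

Bounding the result to the 0 to 1 range keeps it comparable with the sample ECRS values used in the script that follows.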
evidence_categories = {
    "statistics_and_data": 0.9,
    "formal_scientific_studies_randomized_controlled_trials": 0.85,
    "formal_scientific_studies_meta_analysis": 0.8,
    "formal_scientific_studies_observation_studies": 0.75,
    "proposed_historical_trends": 0.7,
    "expert_testimony": 0.65,
    "expert_and_social_media_claims": 0.6,
    "personal_experience": 0.55,
    "common_sense": 0.5,
    "analogies_or_metaphors": 0.45,
    "cultural_or_social_norms": 0.4,
    "intuition_or_gut_feeling": 0.35,
    "news_articles_or_media_reports": 0.3,
    "survey_data_or_public_opinion_polls": 0.25,
    "eye_witness_testimony": 0.2,
    "visual_evidence": 0.15,
    "historical_artifacts_or_documents": 0.1
}
# Define the evidence replication percentages for each piece of evidence
evidence_replication_percentages = {
    "study_1": 90,
    "study_2": 95,
    "study_3": 85
}
# Define the evidence replication quantities for each piece of evidence
evidence_replication_quantities = {
    "study_1": 5,
    "study_2": 10,
    "study_3": 3
}
# Define the Evidence-to-Conclusion Relevance Score (ECRS) for each piece of evidence
evidence_ecrs = {
    "study_1": 0.8,
    "study_2": 0.9,
    "study_3": 0.7
}
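# Map each piece of evidence to its category.
# NOTE: these assignments are illustrative assumptions; without such a mapping,
# looking up a category name in the per-study dictionaries raises a KeyError.
evidence_category_of = {
    "study_1": "formal_scientific_studies_randomized_controlled_trials",
    "study_2": "formal_scientific_studies_meta_analysis",
    "study_3": "formal_scientific_studies_observation_studies"
}
# Calculate the Category Weighting (ESIW) for each piece of evidence
evidence_esiw = {}
for evidence, category in evidence_category_of.items():
    evidence_esiw[evidence] = evidence_categories[category]
# Calculate the Evidence Verification Score (EVS) for each piece of evidence
evidence_evs = {}
for evidence in evidence_category_of:
    evs = (
        evidence_esiw[evidence]
        * evidence_ecrs[evidence]
        * evidence_replication_quantities[evidence]
        * evidence_replication_percentages[evidence] / 100
    )
    evidence_evs[evidence] = evs
# Calculate the overall Evidence Verification Score (EVS) for the belief
belief_evs = sum(evidence_evs.values())
print("Overall Evidence Verification Score (EVS):", belief_evs)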
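Written as a formula, the calculation that the script above appears to intend is the following, where the index i ranges over the pieces of evidence, ESIW is the category weight of the evidence, ECRS its relevance score, ERQ its replication quantity, and ERP its replication percentage. This is a reading of the sample code, not an official definition.

```latex
\mathrm{EVS}_{\text{belief}}
  = \sum_{i} \mathrm{ESIW}_i \times \mathrm{ECRS}_i \times \mathrm{ERQ}_i \times \frac{\mathrm{ERP}_i}{100}
```

For example, `study_1` in the sample data (ECRS 0.8, ERQ 5, ERP 90) contributes ESIW × 0.8 × 5 × 0.9 = 3.6 × ESIW, where ESIW is the weight of whichever category the study is assigned to.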
Here is the type of code that could provide scores for each type of evidence:
# Sample data
arguments = [
    {
        "id": 1,
        "text": "Statistics and data are the most important type of evidence",
        "pro": True,
        "scores": [8, 7, 9, 6, 8]
    },
    {
        "id": 2,
        "text": "There are other types of evidence that are equally important",
        "pro": False,
        "scores": [5, 6, 7, 4, 7]
    },
    # Add more arguments here...
]
# Calculate the sum of scores for the pro arguments that agree that statistics and data are important
pro_sum = sum(arg["scores"][-1] for arg in arguments if arg["pro"] and arg["scores"][-1] >= 7)
# Calculate the sum of scores for the con arguments that disagree that statistics and data are important
con_sum = sum(arg["scores"][-1] for arg in arguments if not arg["pro"] and arg["scores"][-1] < 7)
# Calculate the statistics_and_data value as the ratio of the pro sum to the con sum
if con_sum != 0:
    statistics_and_data = pro_sum / con_sum
else:
    statistics_and_data = 1.0  # If there are no con arguments, assume a perfect score
print("The statistics_and_data value is:", statistics_and_data)