Summarize parts #7

Open
wants to merge 63 commits into
base: master
63 commits
bb81e1f
don't support namespace
wangfenjin Nov 29, 2023
39cb0bb
fix UI bug
wangfenjin Nov 29, 2023
b3fb043
pass metadata
wangfenjin Nov 29, 2023
f3c8511
feat: add page no and line info into the resp
Nov 29, 2023
7100ad7
fix: type error
Nov 29, 2023
caab397
feat: add translate api
Nov 29, 2023
4300d0e
feat: add translate lang
Nov 29, 2023
d6da379
add translation prompt
wangfenjin Nov 29, 2023
ae04fb9
feat: frontend layout and pdf loading
Nov 29, 2023
7f0c129
indexing template
hakanoganalpar1 Nov 29, 2023
f4e4f6d
Merge branch 'dev' into index_changes
Nov 29, 2023
972cd11
wip
hakanoganalpar1 Nov 29, 2023
b8902a2
index changes can be pushed, agent instances are seperated
hakanoganalpar1 Nov 29, 2023
ee102aa
feat: annotation layout
Nov 29, 2023
adf3054
Merge pull request #2 from ikkyu-ai/dev
BenChen99 Nov 30, 2023
4e1e483
using only 1 agent right now
hakanoganalpar1 Nov 30, 2023
f35aa64
cleanup
hakanoganalpar1 Nov 30, 2023
630ebd3
merged master
hakanoganalpar1 Nov 30, 2023
c1b2f65
Merge pull request #3 from ikkyu-ai/index_changes
BenChen99 Nov 30, 2023
7891abf
feat: load pdf to document assistant
Nov 30, 2023
9413bfd
Merge branch 'master' into load_pdf
Nov 30, 2023
cef8497
fix: code
Nov 30, 2023
289e61e
Merge pull request #4 from ikkyu-ai/load_pdf
rlrh Nov 30, 2023
07d425f
feat: chat area layoutas
Nov 30, 2023
b5bccc9
Revert "feat: load pdf to document assistant"
hakanoganalpar1 Nov 30, 2023
0be8161
feat: annotation interaction
Nov 30, 2023
4dfc87e
Merge pull request #5 from ikkyu-ai/revert-4-load_pdf
rlrh Nov 30, 2023
82ce5b0
Revert "Index changes"
hakanoganalpar1 Nov 30, 2023
3b477c9
Merge pull request #6 from ikkyu-ai/revert-3-index_changes
hakanoganalpar1 Nov 30, 2023
a91f349
feat: hide scroll bar
Nov 30, 2023
f0c233b
wip
hakanoganalpar1 Nov 30, 2023
f3211be
check for index key
hakanoganalpar1 Nov 30, 2023
0d66336
feat: integrate with pinecone
Nov 30, 2023
0d42f6d
Merge pull request #7 from ikkyu-ai/index_creation_api
hakanoganalpar1 Nov 30, 2023
6f9b916
fix: key
Nov 30, 2023
d13157b
fix: translation error
Nov 30, 2023
8350f8a
fix: translate
Nov 30, 2023
8e17a97
changed delay logic
hakanoganalpar1 Nov 30, 2023
aedc983
Merge pull request #8 from ikkyu-ai/change_delay_logic
hakanoganalpar1 Nov 30, 2023
2a65ce6
fix: chat messages bug
Nov 30, 2023
f94a53f
fixed uploads
hakanoganalpar1 Nov 30, 2023
ebfe249
Merge pull request #9 from ikkyu-ai/upload_fixes
hakanoganalpar1 Nov 30, 2023
016670d
feat: store chat histories
Nov 30, 2023
984640f
feat: add summary feature
Nov 30, 2023
4c20f8e
fix: locate annocation when clicked
Nov 30, 2023
ea9315c
fix: import errir
Nov 30, 2023
1436629
returning more from vector db
hakanoganalpar1 Nov 30, 2023
bfe51bc
Merge pull request #10 from ikkyu-ai/improve_results
hakanoganalpar1 Nov 30, 2023
e462072
feat: upgrade the query sentence
Nov 30, 2023
e913674
translate prompt
wangfenjin Nov 30, 2023
a646aa1
fix quote
wangfenjin Nov 30, 2023
6d76afd
feat: set summary
Nov 30, 2023
4c6a907
feat: add full summary display
Nov 30, 2023
c11b2ff
handle phrase seperately
hakanoganalpar1 Nov 30, 2023
2bad1c8
feat: annotation link logic
Nov 30, 2023
67c4535
feat: upgrade the query sentence
Nov 30, 2023
8331dd7
translate prompt
wangfenjin Nov 30, 2023
7f35c14
fix quote
wangfenjin Nov 30, 2023
22dfb3a
feat: set summary
Nov 30, 2023
4d3c514
feat: add full summary display
Nov 30, 2023
d1c257e
feat: annotation link logic
Nov 30, 2023
21784a8
feat: frontend changes
Nov 30, 2023
4523e17
merge conf resolved
hakanoganalpar1 Nov 30, 2023
9 changes: 9 additions & 0 deletions README.md
@@ -44,6 +44,15 @@ Built with:

All commands are run from the root of the project, from a terminal:

Before running `npm install`, prepare your environment:

| Command | Action |
| :-------------------- | :-----------------------------------------------|
| `nvm use 18` | Use Node.js version 18 or later |
| `brew install pkg-config cairo pango libpng jpeg giflib librsvg`| Install the native libraries that node-canvas needs on Apple Silicon laptops |

After these two steps, you can start with:

| Command | Action |
| :-------------------- | :-----------------------------------------------|
| `npm install` | Installs dependencies |
Binary file added docs/deep-learning.pdf
Binary file not shown.
Binary file added docs/lbdl.pdf
Binary file not shown.
Binary file added docs/sample.pdf
Binary file not shown.
4 changes: 3 additions & 1 deletion next.config.js
@@ -2,8 +2,10 @@
const nextConfig = {
reactStrictMode: true,
swcMinify: true,
transpilePackages: ['@douyinfe/semi-ui', '@douyinfe/semi-icons', '@douyinfe/semi-illustrations'],
webpack(config) {
config.experiments = { ...config.experiments, topLevelAwait: true };
// config.experiments = { ...config.experiments, topLevelAwait: true };
config.externals.push({ sharp: 'commonjs sharp', canvas: 'commonjs canvas' });
return config;
},
};
3,275 changes: 1,969 additions & 1,306 deletions package-lock.json

Large diffs are not rendered by default.

6 changes: 6 additions & 0 deletions package.json
@@ -11,6 +11,8 @@
"prepare:data": "tsx -r dotenv/config ./src/scripts/pinecone-prepare-docs.ts"
},
"dependencies": {
"@douyinfe/semi-icons": "^2.47.1",
"@douyinfe/semi-ui": "^2.47.1",
"@pinecone-database/pinecone": "^0.1.6",
"@radix-ui/react-accordion": "^1.1.2",
"@radix-ui/react-dropdown-menu": "^2.0.5",
@@ -20,12 +22,14 @@
"@types/node": "20.4.9",
"@types/react": "18.2.19",
"@types/react-dom": "18.2.7",
"ahooks": "^3.7.8",
"autoprefixer": "10.4.14",
"class-variance-authority": "^0.7.0",
"clsx": "^2.0.0",
"dotenv": "^16.3.1",
"eslint": "8.46.0",
"eslint-config-next": "13.4.13",
"jsdom": "^23.0.0",
"langchain": "^0.0.126",
"lucide-react": "^0.265.0",
"next": "13.4.12",
@@ -35,6 +39,7 @@
"react": "18.2.0",
"react-dom": "18.2.0",
"react-markdown": "^8.0.7",
"react-pdf-highlighter": "^6.1.0",
"react-wrap-balancer": "^1.0.0",
"tailwind-merge": "^1.14.0",
"tailwindcss": "3.3.3",
@@ -43,6 +48,7 @@
"zod": "^3.21.4"
},
"devDependencies": {
"@types/lodash": "^4.14.202",
"@types/pdf-parse": "^1.1.1",
"tsx": "^3.12.7"
}
267 changes: 267 additions & 0 deletions src/app/PdfDisplayer.tsx
@@ -0,0 +1,267 @@
"use client";

import React, { Component } from "react";

import {
PdfLoader,
AreaHighlight,
Popup
} from "react-pdf-highlighter";
import { Highlight } from "./components/Highlight";
import Tip from "./components/Tip";

// import type { IHighlight, NewHighlight } from "react-pdf-highlighter";

import { Sidebar } from "./Sidebar";
import { Spinner } from "./Spinner";
import { testHighlights as _testHighlights } from "./test-highlights";

import "./style/App.css";
import { PdfHighlighter } from "./components/PdfHighlighter";
import { PdfContext } from "./page";
import { IHighlight, NewHighlight } from "./types/types";

const testHighlights: Record<string, Array<IHighlight>> = _testHighlights;

interface State {
url: string;
}

const getNextId = () => String(Math.random()).slice(2);

const parseIdFromHash = () =>
document.location.hash.slice("#highlight-".length);

export const resetHash = () => {
document.location.hash = "";
};

const HighlightPopup = ({
comment,
}: {
comment: { text: string; emoji: string };
}) =>
comment.text ? (
<div className="Highlight__popup">
{comment.emoji} {comment.text}
</div>
) : null;

// https://arxiv.org/pdf/1708.08021.pdf
const PRIMARY_PDF_URL = "https://arxiv.org/pdf/1708.08021.pdf";
// const PRIMARY_PDF_URL = "file:///Users/bytedance/pdf.js/web/compressed.tracemonkey-pldi-09.pdf";
const SECONDARY_PDF_URL = "https://arxiv.org/pdf/1604.02480.pdf";

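// Guard against server-side evaluation, where `document` is undefined.
const searchParams = new URLSearchParams(
typeof document !== "undefined" ? document.location.search : ""
);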

const initialUrl = searchParams.get("url") || PRIMARY_PDF_URL;

class PdfDisplayer extends Component<{
highlights: IHighlight[],
setHighlights: React.Dispatch<React.SetStateAction<IHighlight[]>>,
setSelectedHighlight: React.Dispatch<React.SetStateAction<IHighlight | undefined>>,
addHighlight?: ((highlight: NewHighlight) => void) | undefined,
setSummary?: React.Dispatch<React.SetStateAction<string>>;
isAIBusy: boolean,
setIsAIBusy?: React.Dispatch<React.SetStateAction<boolean>>;
}, State> {
state = {
url: initialUrl,
};

deleteHighlight = (id: string) => {
const highlightsCopy = [...this.props.highlights];
this.props.setHighlights(highlightsCopy.filter(i => i.id !== id));
}

handleOpenFile = async (file: File) => {
const url = URL.createObjectURL(file);
this.setState({
url: url,
});
const key = file.name.replace(/[^a-zA-Z0-9]/g, '').toLowerCase();
const formData = new FormData();
formData.append('file', file);
formData.append('key', key);
try {
this.props.setIsAIBusy?.(true);
const response = await fetch("/api/upload", {
method: "POST",
body: formData,
});
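const reader = response.body?.getReader();
if (!reader) {
// Without a reader the loop below would never see `done` and would spin forever.
throw new Error("Upload response has no readable body");
}
// Reuse one decoder so multi-byte characters split across chunks decode correctly.
const decoder = new TextDecoder();
let streamingSummary = "";
let tokensEnded = false;
while (true) {
const { done, value } = await reader.read();
if (done) {
break;
}
const text = decoder.decode(value, { stream: true });
if (text.includes("tokens-ended") && !tokensEnded) {
tokensEnded = true;
const texts = text.split("tokens-ended");
if (texts.length > 1) {
streamingSummary = streamingSummary + texts[0];
}
} else {
streamingSummary = streamingSummary + text;
}
}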
console.log("streaming summary", streamingSummary);
this.props.setSummary?.(streamingSummary);
} catch (err) {
console.log(err);
} finally {
this.props.setIsAIBusy?.(false);
}
};

scrollViewerTo = (highlight: any) => {};

scrollToHighlightFromHash = () => {
const highlight = this.getHighlightById(parseIdFromHash());

if (highlight) {
this.scrollViewerTo(highlight);
}
};

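componentDidMount() {
window.addEventListener(
"hashchange",
this.scrollToHighlightFromHash,
false
);
}

componentWillUnmount() {
// Remove the listener so it does not outlive the component.
window.removeEventListener("hashchange", this.scrollToHighlightFromHash, false);
}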

getHighlightById(id: string) {
const { highlights } = this.props;

return highlights.find((highlight) => highlight.id === id);
}

updateHighlight(highlightId: string, position: Object, content: Object) {
console.log("Updating highlight", highlightId, position, content);

this.props.setHighlights(this.props.highlights.map((h) => {
const {
id,
position: originalPosition,
content: originalContent,
...rest
} = h;
return id === highlightId
? {
id,
position: { ...originalPosition, ...position },
content: { ...originalContent, ...content },
...rest,
}
: h;
}));
}

render() {
const { url } = this.state;
const { highlights, setHighlights, setSelectedHighlight, setSummary } = this.props;

return (
<div className="App" style={{ display: "flex", height: "100%" }}>
<Sidebar
highlights={highlights.filter(h => h.isSaved) || []}
deleteHighlight={this.deleteHighlight}
onFileOpen={this.handleOpenFile}
/>
<div
style={{
height: "100%",
width: '60vw',
position: "relative",
}}
>
<PdfLoader url={url} beforeLoad={<Spinner />}>
{(pdfDocument) => (
<PdfHighlighter
pdfDocument={pdfDocument}
enableAreaSelection={(event) => event.altKey}
onScrollChange={resetHash}
pdfScaleValue="auto"
scrollRef={(scrollTo) => {
this.scrollViewerTo = scrollTo;

this.scrollToHighlightFromHash();
}}
onSelectionFinished={(
position,
content,
hideTipAndSelection,
transformSelection
) => (
<Tip
onOpen={transformSelection}
onConfirm={(comment) => {
const tempHighlight = { content, position, comment };
this.props.addHighlight?.(tempHighlight);
hideTipAndSelection();
}}
/>
)}
highlightTransform={(
highlight,
index,
setTip,
hideTip,
viewportToScaled,
screenshot,
isScrolledTo
) => {
const isTextHighlight = !Boolean(
highlight.content && highlight.content.image
);

const component = isTextHighlight ? (
<Highlight
highlight={highlight}
isScrolledTo={isScrolledTo}
position={highlight.position}
comment={highlight.comment}
/>
) : (
<AreaHighlight
isScrolledTo={isScrolledTo}
highlight={highlight}
onChange={(boundingRect) => {
this.updateHighlight(
highlight.id,
{ boundingRect: viewportToScaled(boundingRect) },
{ image: screenshot(boundingRect) }
);
}}
/>
);

return (
<Popup
popupContent={<HighlightPopup {...highlight} />}
onMouseOver={(popupContent) =>
setTip(highlight, (highlight) => popupContent)
}
onMouseOut={hideTip}
key={index}
>
{component}
</Popup>
);
}}
highlights={highlights}
/>
)}
</PdfLoader>
</div>
</div>
);
}
}

export default PdfDisplayer;
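
Review note on the streaming loop in `handleOpenFile`: the check `text.includes("tokens-ended")` runs on each chunk separately, so a marker that arrives split across two chunks would likely go unnoticed. The sketch below shows one possible way to handle that by searching the accumulated text instead. It is not the PR's implementation: `readUntilMarker` is a hypothetical helper name, and it assumes that everything after the marker should be discarded, which the diff does not confirm.

```typescript
// Hypothetical helper, not part of the PR.
// Marker string taken from handleOpenFile in PdfDisplayer.tsx.
const END_MARKER = "tokens-ended";

async function readUntilMarker(
  stream: ReadableStream<Uint8Array>,
  marker: string = END_MARKER
): Promise<string> {
  const reader = stream.getReader();
  // One decoder for the whole stream, so a multi-byte character
  // split across two chunks is decoded correctly.
  const decoder = new TextDecoder();
  let buffer = "";
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      buffer += decoder.decode(value, { stream: true });
      // Search the accumulated text, so a marker split across chunks is still found.
      const cut = buffer.indexOf(marker);
      if (cut !== -1) {
        // Assumption: nothing after the marker is needed, so stop reading.
        await reader.cancel();
        return buffer.slice(0, cut);
      }
    }
    // Flush any bytes the decoder is still holding.
    buffer += decoder.decode();
    return buffer;
  } finally {
    reader.releaseLock();
  }
}
```

In the component this could be called as `readUntilMarker(response.body)` after checking that `response.body` is not null. Note the behavioral difference: the sketch cancels the stream at the marker, while the loop in the diff keeps reading until the server closes the response.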
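
Review note on the upload key in `handleOpenFile`: the key is built by stripping every character outside `a-zA-Z0-9` from the file name, so different file names can map to the same key, and a name made only of non-ASCII characters collapses to its extension. The sketch below reproduces that expression in a hypothetical helper, `toUploadKey`, to make the effect visible. `toSeparatedUploadKey` is an assumed alternative that keeps word boundaries; it is not in the PR, and whether collisions matter depends on how `/api/upload` uses the key, which this diff does not show.

```typescript
// Hypothetical helper mirroring the expression used in handleOpenFile.
function toUploadKey(fileName: string): string {
  return fileName.replace(/[^a-zA-Z0-9]/g, "").toLowerCase();
}

// Assumed alternative (not in the PR): replace each run of other
// characters with a single "-" so word boundaries survive.
function toSeparatedUploadKey(fileName: string): string {
  return fileName
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

console.log(toUploadKey("Deep-Learning.pdf")); // deeplearningpdf
console.log(toUploadKey("a-b.pdf") === toUploadKey("ab.pdf")); // true (collision)
console.log(toSeparatedUploadKey("a-b.pdf")); // a-b-pdf
console.log(toSeparatedUploadKey("ab.pdf")); // ab-pdf
```

Neither variant handles non-ASCII file names well; if those are expected, a hash of the file contents may be a more robust key.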