- Bug: On the dev box, at some point, possibly when I imported an OPML
file, a bunch of entries in the `counts` table were created badly:
they all have num_read==9. In those cases where there aren't 9 items
total, this leads to a negative unread count.
Looks like this was due to the update_item trigger being
truncated: it was missing the "WHERE" line at the end. Does this
happen in real life?
- Bug?: The feed URL is being updated where it arguably shouldn't be:
When update.php runs, it says that a feed has an HTTP redirect,
e.g.,
http://foo.com/rss -> http://bar.org/rss
But then a feed can have a link-to-self, e.g.,
<channel>
<atom:link href="http://foo.com/rss" rel="self" type="application/rss+xml"/>
(Le Monde does this). This is an Atom element that says, "here is the
preferred URL at which to get this feed." So this URL is official,
and the HTTP redirect is intentional, if the admin knows what
they're doing.
Or perhaps the admin updated the feed URL and forgot to update
the <atom:link>. So I'm not sure what to do about this.
- Bug: When subscribing to a feed, the initial display doesn't work:
subscribe to a new feed, which gets ID 999. This redirects you to
.../view.php?id=999.
init() gets 'feeds' from cache, then calls set_feed_fields().
set_feed_fields() tries to get its information from 'feeds',
but that's a cached version that doesn't have the new feed yet.
- Bug: When I changed the feed URL for The Atheist Experience, the
posts appear out of order: last_update is when it was refreshed, but
the pub_date is when the post was published:
mysql> select pub_date, last_update from items where !is_read and feed_id=50;
+---------------------+---------------------+
| pub_date | last_update |
+---------------------+---------------------+
| 2012-02-23 19:02:56 | 2012-03-07 09:14:01 |
| 2012-02-24 09:34:12 | 2012-03-07 09:14:01 |
| 2012-02-25 18:29:40 | 2012-03-07 09:14:01 |
| 2012-02-26 20:48:49 | 2012-03-07 09:14:01 |
| 2012-02-27 19:05:02 | 2012-03-07 09:14:01 |
| 2012-02-28 09:29:09 | 2012-03-07 09:14:01 |
| 2012-02-28 18:12:45 | 2012-03-07 09:14:01 |
| 2012-03-02 14:48:49 | 2012-03-07 09:14:01 |
| 2012-03-04 18:23:50 | 2012-03-07 09:14:01 |
| 2012-03-05 17:51:24 | 2012-03-07 09:14:01 |
+---------------------+---------------------+
In this case, last_update should default to pub_date. Or something.
- Bug?: Automatic RSS redirect: This got broken by the "Stop SOPA"
protest, when several sites redirected their RSS feeds to an
anti-SOPA page. After that, several feeds got stuck on that page, so
they didn't update.
Perhaps the best thing to do would be to not redirect
automatically, but to generate a fake article in feed 0 saying where
the feed got redirected to. Perhaps include a link to 'editfeed.php'
with the new URL.
Update editfeed.php to allow supplying a new parameter.
What about when a feed URL is good, but includes a (wanted)
redirect? Ought to have some list of redirects to ignore.
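A rough sketch of the "don't follow redirects blindly" idea. Everything here is hypothetical: the function name, the shape of the ignore list, the returned action objects, and the 'id' and 'url' parameters to editfeed.php are all made up for illustration. The real logic would live in the PHP update code.

```javascript
// Hypothetical sketch: decide what to do when a feed URL redirects.
// `ignored` is an assumed list of {from, to} pairs that are known to be wanted.
function handleRedirect(feed, newUrl, ignored) {
    var known = ignored.some(function (r) {
        return r.from === feed.url && r.to === newUrl;
    });
    if (known) {
        // Wanted redirect: keep fetching through it, leave the stored URL alone.
        return { action: "follow" };
    }
    // Unknown redirect: don't touch the feed; post a notice to feed 0 instead.
    return {
        action: "notify",
        article: {
            feed_id: 0,
            title: "Feed " + feed.id + " was redirected",
            // Assumed parameter names for editfeed.php.
            link: "editfeed.php?id=" + feed.id +
                  "&url=" + encodeURIComponent(newUrl)
        }
    };
}
```

With this shape, a protest-page redirect would only produce a notice in feed 0, and the feed's stored URL would stay put until someone clicks through to editfeed.php.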
- Bug?: There are some old articles in the database. Why aren't they
being cleaned out properly?
db_update_feed() only cleans out whichever feed it was called
with. There's nothing to clean out all feeds, including inactive
ones.
Arguably this isn't that much of a problem with inactive
feeds, since the number of articles won't grow.
But some active feeds aren't being cleaned out either.
Apparently this is because the URLs are dead.
- Bug: Post titles (and other things) sometimes have "&amp;" instead of
"&".
Need to figure out whether we want to store things as "text"
or HTML in the database.
Storing as HTML means we get to preserve markup, e.g., if a
title uses an italicized phrase or a weird entity. For this to work,
need to go through the feed-parsing code and make sure that if a
title is text, we convert to HTML; if it's HTML, we merely clean it
up.
Assume that any text-type text (titles, summaries, contents)
are HTML, and therefore need to be cleaned up.
Email addresses are text, but should probably be sanitized.
Author names... not sure. There may be non-ASCII characters,
and different people may deal with this differently (UTF-8 vs. HTML
entities). Perhaps safest to assume that they're HTML and clean up
(if possible, only do entity translation, and throw out any markup).
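A minimal sketch of the "store everything as HTML" rule, assuming the parser knows whether a field arrived as text or as HTML. The function names are made up, and the cleanHtml callback stands in for whatever does the real cleanup (HTMLPurifier on the PHP side).

```javascript
// Hypothetical helper: convert a plain-text field to HTML by escaping
// the characters that have special meaning in markup.
function textToHtml(text) {
    return text
        .replace(/&/g, "&amp;")   // must come first, or we double-escape
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;");
}

// Assumed normalization step at parse time: text gets escaped exactly once,
// HTML is handed to the cleaner and never escaped again.
function normalizeField(value, type, cleanHtml) {
    return type === "text" ? textToHtml(value) : cleanHtml(value);
}
```

The point of doing this in one place is that a stray "&amp;" in a displayed title then means the field was escaped twice, which is much easier to hunt down.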
- REST: $rreq is used so often in the controllers, it probably makes
sense to move it into a RESTController parent class.
- Create this parent class.
- Move $rreq into it
- See which functions make sense as standalones
- REST: Would it make sense to take some of the methods out of REST
controller classes, and make them global functions?
Yes if that makes them easier to call from external programs,
or if they don't really use OO-ness.
- Bug: keybindings.js doesn't handle non-letter bindings well: "?"
gets turned into Spanish upside-down question mark by toUpperCase().
Need a notation to bind Esc.
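One possible explanation, which is an assumption since I haven't re-read keybindings.js: on a US layout the keydown keyCode for the "/?" key is 191, and String.fromCharCode(191) is the upside-down question mark. A sketch of a key-naming helper that only folds case for letters and has names for special keys. The NAMED_KEYS table and the "Esc" notation are made up, and ev.key is not available in older browsers.

```javascript
// String.fromCharCode(191) is "\u00bf" (the upside-down question mark).
var NAMED_KEYS = { 27: "Esc", 13: "Enter", 32: "Space" };  // assumed notation

// Hypothetical helper: turn a keyboard event into a binding name.
function keyName(ev) {
    if (NAMED_KEYS.hasOwnProperty(ev.keyCode)) {
        return NAMED_KEYS[ev.keyCode];
    }
    // Prefer the character the browser reports, if it reports one.
    if (typeof ev.key === "string" && ev.key.length === 1) {
        var ch = ev.key;
        // Only fold case for letters; leave "?" and friends alone.
        return /^[a-z]$/.test(ch) ? ch.toUpperCase() : ch;
    }
    // Fallback: this is the path that yields the wrong character for "?".
    return String.fromCharCode(ev.keyCode);
}
```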
- IMAP interface: would it be possible to write a Dovecot plugin to
present an IMAP interface to the database? That way, you could use
any mail reader, and the database would be kept in sync.
- Split up PHP scripts into HTML (presentation) vs. REST (do things,
return results in JSON).
Now that the skins are gone, it should be possible to just use
HTML outlines for initial presentation, and do everything useful
with REST/JSON.
The initial-presentation scripts can just be turned into
static HTML pages. Some of them still have logic in them, and this
will have to be split up.
Perhaps put HTML files in ./htdocs, and REST scripts in
./htdocs/rest. Update URLs accordingly.
- Slow sync vs. fast sync:
Have the server send a timestamp (down to the millisecond, I'd
say) of the last update. The client will use this when requesting
the next update.
Initial (slow sync):
- Client requests sync of feed $f, around item $i, last_update 0
(epoch),
- Server sorts feed $f by time, returns a list of the items around
$i, and the current timestamp.
Subsequent fast sync:
- Client requests sync of feed $f, around item $i, last_update
12345 (whatever the server returned in the previous sync).
- Server sorts feed $f, gets list of items around $i, but only sends
the ones that have been updated since 12345.
Problems:
We need a cap, e.g., 100 items per update. How to indicate
continuation? Set a flag in the response, or set the last-update
timestamp to the most-recently-sent item? Or perhaps include the
timestamp and ID of the last-sent item (need to consider the
possibility that 1000 items have been updated at once with the same
timestamp).
What about if I'm reading (and syncing) feed 1, then start
reading feed 2, then feed "all"? The sync stamps are different.
Perhaps the client can send a hash of timestamps, saying "I have
feed 1 up to time_1, feed 2 up to time_2, "all" up to time_3"?
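A sketch of the capped fast sync, using the "timestamp and ID of the last-sent item" option so that a batch can stop in the middle of a run of identical timestamps. This is hypothetical server-side logic written in JavaScript for brevity; it leaves out the "around item $i" windowing, and the field names (mtime, id) are assumptions.

```javascript
// Hypothetical sketch. Items are {id, mtime}; the cursor is {mtime, id}.
// A slow sync starts from the cursor {mtime: 0, id: 0}.
function syncBatch(items, cursor, cap) {
    var changed = items
        .filter(function (it) {
            return it.mtime > cursor.mtime ||
                   (it.mtime === cursor.mtime && it.id > cursor.id);
        })
        .sort(function (a, b) {
            return a.mtime - b.mtime || a.id - b.id;
        });
    var batch = changed.slice(0, cap);
    var last = batch.length ? batch[batch.length - 1] : cursor;
    return {
        items: batch,
        cursor: { mtime: last.mtime, id: last.id },
        more: changed.length > batch.length   // continuation flag
    };
}
```

The client would keep one cursor per thing it syncs (feed 1, feed 2, "all") and send back whichever one applies, which covers the hash-of-timestamps idea as well.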
- Groups: we can precompute group membership: just add an "implied"
boolean field to the group table. "implied == false" means that the
user set group membership explicitly; "implied == true" means that
it's implied by the explicit relationships.
Thus, if we have
group All, gid -1
group News, gid -2
group Politics, gid -3
feed Electoral Vote, id 123
and we add Electoral Vote to group Politics, Politics to News, and
News to All, we'll have explicit relationships:
Electoral Vote (123) in Politics (-3)
Politics (-3) in News (-2)
News (-2) in All (-1)
from which we can derive implicit memberships:
Electoral Vote (123) in News (-2)
Electoral Vote (123) in All (-1)
Politics (-3) in All (-1)
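The derivation above is just a transitive closure. A sketch, assuming memberships are handed over as [member, group] pairs; the function name and data shape are made up, and in practice this would probably run in PHP or SQL whenever an explicit membership changes.

```javascript
// Hypothetical sketch: derive implied memberships from explicit ones.
// Feeds have positive IDs and groups negative ones, as in the example.
function impliedMemberships(explicit) {
    var parents = {};   // member -> list of groups it is directly in
    var direct = {};    // "member,group" keys for the explicit pairs
    explicit.forEach(function (pair) {
        (parents[pair[0]] = parents[pair[0]] || []).push(pair[1]);
        direct[pair.join(",")] = true;
    });

    var implied = [];
    Object.keys(parents).forEach(function (key) {
        var member = Number(key);
        var seen = {};
        var queue = parents[key].slice();
        while (queue.length) {
            var g = queue.shift();
            if (seen[g]) { continue; }   // so a cycle can't loop forever
            seen[g] = true;
            if (!direct[member + "," + g]) {
                implied.push([member, g]);   // these get implied == true
            }
            (parents[g] || []).forEach(function (p) { queue.push(p); });
        }
    });
    return implied;
}
```

Called with [[123, -3], [-3, -2], [-2, -1]], this yields the three implied pairs listed above.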
- Use OAuth for authentication?
http://oauth.net/2/
Except it's apparently a mess, hard to use, hard to know when it's
secure.
- Detecting online/offline status: discussion at
http://stackoverflow.com/questions/3181080/how-to-detect-online-offline-event-cross-browser
There's a library for that:
https://github.com/HubSpot/offline
- Look for better fonts?
http://www.fontsquirrel.com/
League of Movable Type
- CSS text-overflow property:
http://www.w3schools.com/cssref/css3_pr_text-overflow.asp
text-overflow: clip | ellipsis | <string>;
When text doesn't fit into the allotted box, can use an ellipsis for
the text that doesn't fit.
- "use strict": apparently modern JavaScript implementations grok
"use strict";
https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Functions_and_function_scope/Strict_mode
Put '"use strict";' at the top of the script, at the global
level, to turn on strictness for an entire script. Or after the
opening brace to turn it on for a function.
Tue Oct 23 09:06:52 2012: But this doesn't seem to change
anything. Perhaps because javascript.options.strict is already
turned on in about:config.
- Variables must be declared; using an undeclared variable is
an error.
- Assignments that would normally fail silently throw an
error.
- Deleting undeletable properties throws an error.
- Can't have duplicate properties in object literal:
var o = { foo: 1, foo:2 };
works normally, but fails under strict (in ES5; ES2015 and later
allow it again).
- Require unique argument names to functions.
- Forbids octal syntax.
Apparently JSLint prefers the syntax
(function() {
"use strict";
... body of script ...
})();
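A small check of two of the rules above, using function-level strictness so it behaves the same wherever it is pasted. The function names are made up for the demonstration.

```javascript
// Assigning to an undeclared variable throws a ReferenceError under
// strict mode instead of silently creating a global.
function strictAssign() {
    "use strict";
    try {
        undeclaredCounter = 1;   // never declared anywhere
        return "no error";
    } catch (e) {
        return e.name;
    }
}

// Writing to a read-only property is a silent no-op in sloppy mode,
// but a TypeError under strict mode.
function strictFrozenWrite() {
    "use strict";
    var o = Object.freeze({ foo: 1 });
    try {
        o.foo = 2;
        return "no error";
    } catch (e) {
        return e.name;
    }
}
```

If both calls report an error name rather than "no error", strictness is actually in effect, which is a quicker test than staring at about:config.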
- Getters and setters:
https://developer.mozilla.org/en-US/docs/JavaScript/Guide/Working_with_Objects?redirectlocale=en-US&redirectslug=Core_JavaScript_1.5_Guide%2FWorking_with_Objects#Defining_getters_and_setters
- Profiling: according to
http://fuelyourcoding.com/webkits-javascript-profiler-explained/
can turn Webkit (Chrome/Safari) profiling on and off with
console.profile("my profile name");
<code to profile>
console.profileEnd("my profile name");
- Possible feature: "save for later": Grab a random web page and turn
it into a post in an artificial feed.
Could do this with a bookmarklet that grabs the current
location and submits that to a script.
Need to analyze the content of the page somehow. Perhaps can
check to see whether it contains "<article>...</article>" and if so,
use that. Otherwise... not sure.
- Le Monde titles: apparently their RSS feed uses UTF-8 for nearly
everything, but Windows-1252 for _some_ titles. Can't fix. Won't
fix.
Or maybe they use Windows-1252 for everything, and just claim
that it's UTF-8.
- Mutation events (like "DOMNodeInserted" in js/PatEvent.js) are
apparently now deprecated in favor of the DOM MutationObserver:
https://hacks.mozilla.org/2012/05/dom-mutationobserver-reacting-to-dom-changes-without-killing-browser-performance/
- Events to look up:
https://developer.mozilla.org/en-US/docs/tag/events
devicelight: ambient light sensor:
https://developer.mozilla.org/en-US/docs/DOM/DeviceLightEvent
Could use this to switch to dark theme at night.
But apparently not supported by Android or Mobile Safari.
devicemotion: accelerometer
deviceorientation: device has rotated (landscape <-> portrait)
deviceproximity: proximity of physical object (?)
drag: when element or text selection being dragged
dragend: when drag operation has ended
dragenter:
dragleave:
dragover:
dragstart:
drop:
durationchange:
emptied:
ended:
focus:
hashchange: when hash part of URL in URL bar changes.
input:
invalid:
loadeddata:
loadedmetadata:
loadstart:
message:
mozfullscreenchange:
mozfullscreenerror:
mozpointerlockchange:
mozpointerlockerror:
offline:
online:
pagehide:
pageshow:
paste:
pause:
play:
playing:
popstate:
progress:
ratechange:
reset:
resize:
scroll:
seeked:
seeking:
select:
show:
stalled:
submit:
suspend:
timeupdate:
unload:
userproximity:
volumechange:
waiting:
- document.querySelector() and .querySelectorAll() look massively
useful:
http://www.javascriptkit.com/dhtmltutors/css_selectors_api.shtml
- location.replace(url): fills in url in the URL bar, and immediately
redirects to that location.
However, if the current and new URLs differ only in the bit
after "#", it doesn't reload anything. So this is a way to change
the current URL in the URL bar, without having to reload the page.
This works in Chrome, but Firefox throws an
apparently-harmless exception.
- <element>.getBoundingClientRect():
https://developer.mozilla.org/en-US/docs/DOM/element.getBoundingClientRect
returns a struct with the top, bottom, left, right of the element,
relative to the viewport. Basically, this means that:
el = <some element>
box = el.getBoundingClientRect()
If box.top < 0, the top of the element is above the viewport.
If box.bottom < 0, the bottom of the element is above the viewport.
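The same reasoning as a helper, as a sketch. It takes the rect as a plain object, so it can be tried outside a browser; the function name and the returned labels are made up.

```javascript
// Hypothetical helper: classify an element's vertical position from the
// object returned by el.getBoundingClientRect(). In a browser,
// viewportHeight would be window.innerHeight.
function verticalPosition(box, viewportHeight) {
    if (box.bottom < 0) { return "above"; }            // entirely scrolled past
    if (box.top > viewportHeight) { return "below"; }  // not reached yet
    if (box.top >= 0 && box.bottom <= viewportHeight) {
        return "fully visible";
    }
    return "partially visible";
}
```

In the page this would be called as verticalPosition(el.getBoundingClientRect(), window.innerHeight).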
- Probably time to switch to a real JS template library. Perhaps
http://handlebarsjs.com/
unless there's something better. jQuery apparently has some good
stuff, including a template plugin.
- SED_REPLACEMENTS is bulky and awkward. It'd probably be best to
generate a 'replace.sed' script (or something) in the top directory.
It can then be used by every other Makefile.
- localStorage: performance sucks under Firefox. See
http://hacks.mozilla.org/2012/03/there-is-no-simple-solution-for-local-storage/
for possible replacements: IndexedDB (Chrome, Firefox) or WebSQL
(for mobile Safari, Android).
WebSQL is allegedly deprecated by the W3C. But mobile Safari and
Android seem to run fine. So perhaps just use IndexedDB on Firefox,
and retain localStorage on the others?
See also
https://developer.mozilla.org/en-US/docs/Web/API/IndexedDB_API
- Problem with current sync (as of v988): say you have a browser
viewing only a feed like Zero Punctuation, which is only updated
once a week.
cache.get_marked() will get all marked items newer than the
oldest item in the cache. In this case, that means it'll have to
plow through hundreds or thousands of irrelevant updates with only
one or two items that are actually in cache.
Perhaps a better approach would be something more like PalmOS
syncing: the client says what it has in cache, and the server tells
it what it ought to have. Preferably this should come in both slow
and fast sync varieties: slow to get all changes, fast to, well, be
fast: only update those things that have changed since a given time.
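A sketch of the PalmOS-style exchange, assuming the client sends the IDs it has cached plus the time of its last sync. This is hypothetical server-side logic written in JavaScript for brevity; the names and the shape of the reply are made up.

```javascript
// Hypothetical sketch. `serverItems` maps item ID to {is_read, mtime};
// `cachedIds` is what the client says it holds.
// since = 0 gives a slow sync, anything later a fast one.
function changesForClient(serverItems, cachedIds, since) {
    var out = [];
    cachedIds.forEach(function (id) {
        var it = serverItems[id];
        if (!it) {
            out.push({ id: id, deleted: true });   // gone on the server
        } else if (it.mtime > since) {
            out.push({ id: id, is_read: it.is_read, mtime: it.mtime });
        }
    });
    return out;
}
```

The work here scales with the size of the client's cache rather than with the number of updates on the server, which is the whole point for a feed like Zero Punctuation.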
- Sync: Probably ought to split up "updates" into multiple phases:
- Send articles that have been marked read/unread
- Get articles that have been marked read/unread since time T
- Get new articles
Also, need to store time of last update in localStorage.
Probably not worth having a feed ID for getting list of marked
items.
- List of Mobile Safari events:
http://developer.apple.com/library/safari/#documentation/AppleApplications/Reference/SafariWebContent/HandlingEvents/HandlingEvents.html#//apple_ref/doc/uid/TP40006511-SW5
(Select "Supported Events" if that doesn't work)
pageshow/pagehide:
https://developer.mozilla.org/En/Using_Firefox_1.5_caching#pageshow_event
Looks like they're similar to load/unload events, except they fire
even when a page is loaded from cache.
- WebStorage spec (http://www.w3.org/TR/webstorage/) says that when a
value in localStorage is modified, it triggers an event in other
windows that have access to that database. This could be useful.
window.addEventListener("storage",
function(ev) {
console.log("Got a storage event: %s was %s, now %s",
ev.key, ev.oldValue, ev.newValue);
},
false);
- Perhaps add Xdebug extension for PHP: profiler.
- Add a /* SCHEMA */ comment everywhere where the code needs to be
modified when the schema changes.
- WaPo articles have <wp:web-link>, <wp:mobile-link>, <wp:share-link>,
<wp:search-link>.
<wp:mobile-link> is not, unfortunately, a link to a
mobile-friendly web page. It does, however, contain
<ece:field name="body">, which contains the full text of the
article, or at least a longer excerpt than the normal RSS feed. This
could conceivably be used to get a better 'content' field.
For efficiency, ought to fetch this only when an item is being
added, or when it has been modified since first publication.
- Perhaps add another phase to collecting articles: examining
individual posts. Plugins could go here.
The idea is: update_feed() fetches a given feed and parses it
into articles and such. Then, when that's done, feed each (new?)
article to a plugin that can rewrite the article.
This plugin can do things like:
- For GoComics feeds, scrape the article page and retrieve the
URL for the image (look for a <div> with class "comic", or whatever
it is). Rewrite item.summary to include the image.
- For Washington Post articles, scrape the page and get the
keywords from the HTML header:
<meta name="keywords" content="virginia,democrats,education,state budget,legislators"/>
It can use this to tag an article by keywords, or even auto-kill the
sports articles.
The latter can also be done by looking at <pheedo:origLink>,
to find articles at http://www.washingtonpost.com/sports/...
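A sketch of the two Washington Post checks as they might look inside such a plugin. The function names are made up, the regex is a shortcut that assumes name comes before content as in the example tag above, and the real plugin would be PHP.

```javascript
// Hypothetical plugin helper: pull the keyword list out of a page's
// <meta name="keywords"> tag.
function metaKeywords(html) {
    var m = /<meta\s+name="keywords"\s+content="([^"]*)"/i.exec(html);
    if (!m) { return []; }
    return m[1].split(",")
        .map(function (k) { return k.trim(); })
        .filter(function (k) { return k.length > 0; });
}

// Assumed policy: an article is a sports article if its original link
// (from <pheedo:origLink>) is under /sports/.
function isSportsLink(url) {
    return /^https?:\/\/www\.washingtonpost\.com\/sports\//.test(url);
}
```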
- Possible improvement: move static content to a different domain, one
that doesn't require cookies.
- Perhaps a way to get row number, and get the N rows around a given row:
http://jimlife.wordpress.com/2008/09/09/displaying-row-number-rownum-in-mysql/
- Perhaps install PHP pcntl module? Includes fork(), exec(). This
could be useful for a long-running tool on the server that sends
updates on article status to the client(s).
- Key bindings: This is way more complicated than it ought to be:
It looks as though there's keyboard focus, which has nothing
to do with where the mouse is (i.e., HTML/DOM has no concept of
focus-follows-mouse). Rather, the key event goes to whichever
element has keyboard focus (which is originally the body, I guess),
and (apparently) bubbles up from there.
In order to have a completely generic "bind any keymap to any
element" setup with focus-follows-mouse, we'd need to keep track of
every element that has keydown events associated with it: when the
mouse enters, call element.focus() on it, and when the mouse leaves,
call element.blur(). (Also apparently need to
setAttribute('tabindex', 0) for some reason.)
Since (I think) we only want to bind events to items and to
the window as a whole, the best solution seems to be to keep track of
_enter and _exit events (which we do) to see which item is the
current one, and have a window-level "keydown" handler to
invoke the proper function on the current item.
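A sketch of that last arrangement: one dispatcher that tracks the current item and routes key names either to an item binding or to a window binding. The KeyDispatcher name and its methods are made up, and wiring it to real _enter/_exit and keydown events is left out.

```javascript
// Hypothetical sketch: a single window-level key handler plus a notion
// of "current item", updated from the _enter/_exit events.
function KeyDispatcher(itemKeymap, windowKeymap) {
    this.itemKeymap = itemKeymap;      // key name -> function(item)
    this.windowKeymap = windowKeymap;  // key name -> function()
    this.current = null;
}
function hasBinding(map, key) {
    return Object.prototype.hasOwnProperty.call(map, key);
}
KeyDispatcher.prototype.enter = function (item) { this.current = item; };
KeyDispatcher.prototype.exit = function (item) {
    if (this.current === item) { this.current = null; }
};
// Returns true if some binding handled the key.
KeyDispatcher.prototype.keydown = function (key) {
    if (this.current !== null && hasBinding(this.itemKeymap, key)) {
        this.itemKeymap[key](this.current);
        return true;
    }
    if (hasBinding(this.windowKeymap, key)) {
        this.windowKeymap[key]();
        return true;
    }
    return false;
};
```

Item bindings win over window bindings only while the mouse is over an item, which gives the focus-follows-mouse feel without calling focus() and blur() on anything.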
- Script to provide an RSS/Atom feed of the stuff in the database, for
Calibre.
Presumably should return the last 24 hours' worth of articles.
Perhaps should support individual feeds, or groups (once those
exist).
What should the limiting factor(s) be? Currently there can be
> 500 items in a day.
- Template.expand should return a DOM object, not a string.
- Kindle: make checkboxes bigger.
- Integrate keybinding stuff into PatEvent.
- Move event handlers over to bind_event().
Fix update_feed(id) to work with bind_event (i.e., take only
one argument).
I think this means fixing bind_event to store a keytab with
each
- What's the best way to handle XKCD mouseover text on mobile devices?
Perhaps an HTMLPurifier plugin that finds <img alt="foo"/> and puts
the alt-text under the image, as a caption?
Or perhaps put a button next to the image, which expands into
a box with the alt-text?
- Is it possible to add a "share" button to articles, for sharing on
various social networks? (G+, Facebook, Twitter, etc.)
For Twitter: build a button at
http://twitter.com/about/resources/tweetbutton
For G+: this might do the trick:
http://www.google.com/intl/en/webmasters/+1/button/index.html
Facebook: this, or something related?:
http://developers.facebook.com/docs/reference/plugins/send/
- markitem code in view.jsh is old and crufty. Fix it.
Use AJAX call from xhr.
- Maybe ought to store a feed's language, so that we can figure out
how to sort its name correctly in the main feed display (e.g.,
"Le Monde" (fr) => "MONDE LE", and "El Mundo" (es) => "MUNDO EL").
This can be guessed from the <language> tag in RSS (Atom
doesn't seem to specify), but Le Monde, at least, doesn't bother to
say that it's in French.
Make this a feed option.
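A sketch of the sort-key computation, assuming the language is available as a two-letter code. The article table is made up and incomplete; elided forms like "l'" aren't handled.

```javascript
// Assumed, incomplete table of leading articles per language code.
var LEADING_ARTICLES = {
    en: ["the", "a", "an"],
    fr: ["le", "la", "les"],
    es: ["el", "la", "los", "las"]
};

// Hypothetical helper: "Le Monde" (fr) becomes "MONDE LE".
function feedSortKey(title, lang) {
    var words = title.trim().split(/\s+/);
    var articles = LEADING_ARTICLES[lang] || [];
    if (words.length > 1 && articles.indexOf(words[0].toLowerCase()) !== -1) {
        words.push(words.shift());   // move the article to the end
    }
    return words.join(" ").toUpperCase();
}
```

Without the language, "Le Monde" would sort under L, which is why a wrong or missing <language> tag matters and why this should be overridable per feed.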
- Spinny thing when updating feeds:
Create a single image with all of the various states a feed can be
in: normal, updating, error. Then use CSS to set a class, which in
turn shifts the background image up or down to display the
appropriate icon.
Perhaps have it use the "number of unread articles" column, so
as to take up less space on the iPhone.
- Dates in the database are hopelessly fucked. I'm not sure what I was
trying to do anymore, and when you download a lot of articles from a
new feed, they show up in some bizarre order.
I think the things I care about are:
- When was this article put out on the net?
- (Maybe) When did the author last revise this article?
- How long has it been since anything was posted to this feed?
- When's the last time I successfully got the contents of this
feed?
- mtime: when did I mark this article as read/unread?
lib/database.inc: feeds.last_update should be computed from
items.pub_date. Ignore anything it says in <rss><channel><pubDate>.
We could _almost_ just compute feeds.last_update on the fly,
as the latest item.pub_date, but for one thing: if all of the items
get expired and deleted, we still want to know that the last item
was posted on such-and-such date.
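The feeds.last_update rule as a sketch: take the latest pub_date among the items, but never move the stored value backwards, so it survives every item being expired. The function name is made up, dates are assumed to be numeric timestamps, and the real code would go in lib/database.inc.

```javascript
// Hypothetical sketch. `storedLastUpdate` is the value currently in
// feeds.last_update (or null for a new feed); `items` are the feed's
// current items, each with a numeric pub_date.
function feedLastUpdate(storedLastUpdate, items) {
    return items.reduce(function (latest, item) {
        return Math.max(latest, item.pub_date);
    }, storedLastUpdate || 0);
}
```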
- Feeds page:
Instead of "Tools" and "Details" cluttering up the page, just
list the #unread, title, and a ">>" link that pops up a details/edit
page.
- Feeds page drop-down menus: use an onclick to toggle menu on and
off, for mobile devices and such.
- Editing feed details: instead of (or in addition to) a separate
"edit feed" page, have an "edit" button on the page for a given
feed. When clicked, highlight the parts that can be edited: title
(edit the nickname).
Actually, the nickname is the only visible element that it
makes sense to edit this way. For the rest, there could be a form
overlaid over the feed.
- To make things really fancy, could have live updating of article
status on multiple devices: Have the client open a persistent
connection to the server (basically, a PHP script that never exits).
Over this channel, the server announces changes to items. So if
client A marks item 12345 as read, this information gets broadcast
to all connected clients, and they can update their displays
accordingly.
- JavaScript with namespaces
Things like the Google +1 button use a namespace: <g:plusone> or
whatever it is. This seems fairly useful.
What would be useful would be the ability to get a list of all
tags in a given namespace: if I have a bunch of <foo:a>, <foo:b>,
<foo:c>, ... tags sprinkled through a document, it'd be nice to get
them all without having to parse the DOM.

Looks like the way to do this is
getElementsByTagNameNS("http://namesp.ace/URL", "*");
See
http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113/core.html#ID-A6C9094
- Publisher guidelines from readability.com:
http://www.readability.com/publishers/guidelines/
This should make things show up better in Safari Reader.
- <article>
- class="hentry"
Firefox doesn't grok <article> yet. This breaks CSS.
Firefox 3 apparently groks <article>
- class="entry-title" for post title.
- <time class="updated" datetime="2011-10-24 10:11:12-0400" pubdate>whenever</time>
"updated" for hAtom; "pubdate" for Reader.
- <p class="byline author vcard"><span class="fn">John Q. Author</span></p>
Who wrote the article.
- <div class="entry-content">
Content of the post.
- <div class="entry-content-asset">...</div>
Mark images and such that are directly related to the post.
- Apparently iOS JS can use accelerometer:
http://developer.apple.com/library/safari/#documentation/SafariDOMAdditions/Reference/DeviceMotionEventClassRef/DeviceMotionEvent/DeviceMotionEvent.html#//apple_ref/doc/uid/TP40010525
Perhaps use "shake" to mean "refresh"?
Also rotation:
http://developer.apple.com/library/safari/#documentation/SafariDOMAdditions/Reference/DeviceOrientationEventClassRef/DeviceOrientationEvent/DeviceOrientationEvent.html#//apple_ref/doc/uid/TP40010526
- Would hAtom or hNews be useful?
http://microformats.org/wiki/hAtom
http://microformats.org/wiki/hnews
Might be able to mark things up for Safari reader and/or other
readers. Perhaps might even be able to reuse some class names.
- Can HTMLPurifier add a base URL to relative links? That is, if it
sees
<a href="/foo">
can it be told to turn that into
<a href="http://www.somesite.com/foo">
? Ditto for images and perhaps others.
See
http://htmlpurifier.org/live/configdoc/plain.html#URI.MakeAbsolute
Presumably ought to pass base URL in as a second parameter.
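For reference, the resolution step itself looks like this with the standard URL class. This is not HTMLPurifier's API, just a sketch of the behavior we'd want from it; the function name is made up.

```javascript
// Sketch of the resolution step only, using the standard URL class.
function makeAbsolute(href, baseUrl) {
    try {
        return new URL(href, baseUrl).href;
    } catch (e) {
        return href;   // unparseable: leave it alone
    }
}
```

The base should be the URL of the article or feed the markup came from, so that relative paths like "img/x.png" resolve against the right directory and not just the site root.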
- Minify and compress supporting files.
http://code.google.com/p/minify/
Can run this as:
php-cgi /path/to/minify/min/index.php f=/path/to/foo.js > foo.js.min
Also compress it.
Question: does minifying make compression less efficient? Or
does it remove enough useless fluff that the less-efficient
compression is still worth it?
minify adds headers. Is this desirable?
- Minify CSS, JS: there are various minification tools out there that
will strip comments, extraneous whitespace, etc. from CSS or JS
files.
Like compressing, this should be done at install time.
- Split up distribution.
There should really be two distros: one for hackers who want
the source, and one for people who want to install it.
The hacker distro is pretty easy: it's just all of the source
files and such, a copy of ~/proj/newsbite without the .git noise.
For the installer distro, ought to have a number of
directories, for all the various directories where things can be
installed:
htdocs (must go in DocumentRoot)
lib (can go outside of DocumentRoot)
plugins
...
Ideally, should have an external config file that says where to
install everything, one that users can carry forward when they
upgrade. Also, the installer makefile shouldn't have any
dependencies on git (neither should the hacker one, really).
Also, in the installer distro, the CSS and JS files should be
minified and compressed.
In PHP, can use 'if (extension_loaded("mysql"))' to find out
whether a given module is loaded. Can use
'$module_list = apache_get_modules()' to get list of available
Apache modules. Can use this to suggest a .htaccess to the user.
- dKos text looks funny: first paragraph is indented less than
subsequent ones. Is this something that can be fixed?
It has
text text
<p>More text</p>
<p>yet more text</p>
IOW non-paragraph text isn't being indented. Should HTMLPurifier clean
this?
This happens anywhere there are paragraphs, including inside
<description> and inside <blockquote>
- Function to add an article to a phony feed. This can be used to give
status reports on update, and stuff like that.
Can use internal-rss.php or some such to generate a bogus RSS
feed, or something. Or even flag them in the database.
- Plugin API: perhaps should have a standard return status indicating
what happened, e.g.:
0 - Didn't run (conditions not met)
1 - Ran, but no change
2 - Ran, made a change
Or something like that. The idea being that we can at some point
collect statistics and see which plugins run when. In particular, we
want to see when a plugin stops working because the feed changed the
format of its web bugs or whatever.
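A sketch of how the status codes could feed the statistics. The constant names, the plugin shape ({name, run}), and the runner are all hypothetical; the real plugins are PHP.

```javascript
// Assumed status codes, following the list above.
var PLUGIN_SKIPPED = 0, PLUGIN_NO_CHANGE = 1, PLUGIN_CHANGED = 2;

// Hypothetical runner: each plugin is {name, run(article)} and returns
// one of the codes; stats[name][code] counts how often each came back.
function runPlugins(plugins, articles) {
    var stats = {};
    plugins.forEach(function (p) {
        stats[p.name] = [0, 0, 0];
        articles.forEach(function (article) {
            var status = p.run(article);
            stats[p.name][status] += 1;
        });
    });
    return stats;
}
```

A plugin whose "made a change" count drops to zero while its "no change" count climbs has probably been broken by a format change in the feed.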
- Standalone app: put up a spinner or something when updating cache.
See downspout:/folks/htdocs/app/app.js
Could perhaps use u256d-u2570 (probably not, though), or u2596-u259f,
u25d0-u25d3, u25d4-u25d7, u25dc-u25df, u25f4-u25f7
- Smart ordering: The system potentially has access to a lot of
information about reading habits:
- how often a given feed gets updated
- length of summary and content
- whether there are images in posts
- when an article is visible in the window (maybe not on mobile devices?)
- whether article is fully visible
- whether the entire article was ever shown
- which button was used to mark article as read (top or bottom)
- when an article was marked read:
- time since the last article was marked
- time of day when the article was marked
- day of week when the article was marked
- location where the article was marked
- keywords in the article
- title
- summary
- content
- other meta keywords
All of this could potentially be used as input to a
machine-learning system to figure out what I want to read, in what
order, and let articles bubble to the top accordingly (e.g., it'd be
nice if it could figure out that I want to catch up on comics; let
rarely-updated friend blogs bubble up to the top, and so on).
Problem is figuring out how to tell whether it's doing a good
job or not, i.e., whether I've actually read an article or not.
If I mark a long article as read, one second after marking the
previous one, then presumably that means that I didn't read it. But
how to tell whether I've read an article?
- Perhaps put git rev in authentication cookie? That way, when the
back-end is updated, can run the update script or whatever needs to
be done.
Or maybe just set another cookie with the version, and check
that in common.inc.
- Differences between various versions:
* Article title links
- Desktop: regular <a>. Middle-click to open in new tab
- iPhone/iPad, in Safari: open in new window
- iPhone/iPad, standalone: Open in an iframe or something.
Don't launch Safari.
* manifest:
- Desktop: don't include. Or not yet
- iPhone: include iPhone-specific files
- iPad: include iPad-specific files
- In the mobile app manifest, is the network whitelist really
mandatory? Does it take relative URLs? If not, then add this to the
INSTALL instructions. Either that, or generate the manifest during
installation.
- Mobile app: is it possible to interrupt an update?
swapCache() updates the cache (switches to the most recent
application cache). However, it doesn't reload resources from the
new cache.
I don't think it's possible to abort an update. But it's at
least possible to show a progress bar. When app changes, cache
status goes from CHECKING to DOWNLOADING (multiple times), and
finally UPDATEREADY (or, presumably, NOUPDATE or IDLE).
- For 'update', since the animation is pretty, could have a
"multi-line JSON" output format. Is it worth defining another output
type for this ("jsonm"?)? Or should the client just assume that if
the text has newlines in it, that it's a series of updates?
Probably the former: it'd be nice to know whether to use an
XMLHttpRequest object that needs to follow readyState 3 or not.
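A sketch of how the client side of a "jsonm" format might work, assuming one complete JSON document per line. JsonLineReader is a hypothetical name. The idea is to call read() with the full responseText each time readyState 3 fires, and get back only the updates that have arrived since the last call:

```javascript
// Hypothetical incremental parser for a "jsonm" response: one JSON
// document per line.
function JsonLineReader() {
    this.offset = 0;   // how much of the text has been consumed so far
}

// Given all text received so far (e.g., xhr.responseText at
// readyState 3), return the updates on newly completed lines.
// A trailing partial line is left for the next call.
JsonLineReader.prototype.read = function (text) {
    var updates = [];
    var nl;
    while ((nl = text.indexOf("\n", this.offset)) !== -1) {
        var line = text.substring(this.offset, nl);
        this.offset = nl + 1;
        if (line.trim() !== "") {
            updates.push(JSON.parse(line));
        }
    }
    return updates;
};
```

This relies on the server never putting a raw newline inside a single update, which JSON string escaping already guarantees if each update is serialized on one line.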
- Which database fields have limited size? HTMLPurify them before
displaying.
* Feed title (assume it's text)
* Feed subtitle (assume it's text)
- (Feed nickname)
- (Site URL)
- (Feed URL)
* Feed description
- (URL to feed image)
- (Item URL)
- Item title
- Item summary
- Item content
- Item author
- (Item category)
- (Item content URL)
- (Item comment RSS URL)
- (Item GUID)
Hack to see what I've missed: have HTMLPurifier put a distinctive
marker around the purified parts, like "foo" => "@[foo]@".
- What does HTMLPurifier remove?
- dKos talk threads (<ul>?)
- "About" section in Strange Maps
It's too long for the database, so the end gets truncated,
along with various close-tags.
Quick hack: invoke clean-html when displaying, rather than
when updating.
- Functions and scripts that take a feed ID should also take a group.
In particular, "all" should be considered a group, not a special
feed ID (though it can be considered a special group).
- Add "$Rev$" or something to some file, so we can tell which revision
we should be at. Compare this to the database, and see if we need to
upgrade.
- Add hook for links to mobile versions of URLs (view.php).
Probably should add a subdirectory underneath "plugins".
- For error messages, and any other time when the app needs to talk to
the user: could create a synthetic feed. This could include error
messages from update.php, warnings that a feed has been redirected
(perhaps including a link to update the feed config), and anything
else.
Could perhaps use "internal:" as the URL for these feeds.
- CSS multicolumn stuff:
See http://www.zenelements.com/blog/css3-multiple-columns/
Fixed width:
.multicol {
/* outline: 1px solid red;*/
/* text-align: justify;*/
-moz-column-width: 15em;
-webkit-column-width: 15em;
-moz-column-gap: 2em;
-webkit-column-gap: 2em;
}
Fixed number of columns:
#my_CSS3_id {
text-align: justify;
-moz-column-count: 3;
-moz-column-gap: 1.5em;
-moz-column-rule: 1px solid #c4c8cc;
-webkit-column-count: 3;
-webkit-column-gap: 1.5em;
-webkit-column-rule: 1px solid #c4c8cc;
}
- Worker threads could be useful. However, iPad doesn't seem to
support them.
One way to handle this might be to have synthetic events: with
worker threads, the worker generates an event when new data is
available, which calls a handler in the main thread.
In Mobile Safari (iPhone/iPad), we can have the
data-manipulation code generate synthetic events similar to the ones
the worker thread would. That would allow fewer differences between
threaded and non-threaded code.
See
http://www.davidflanagan.com/javascript5/display.php?n=17-8&f=17/DataEvent.js
- Could build an object to coordinate asynchronous activities:
var checkpoint = new waitForIt(cont_func, arg1, arg2, ...);
checkpoint.waitFor("this");
checkpoint.waitFor("that");
checkpoint.waitFor("theother");
says that once "this", "that", and "theother" have completed, it
should execute cont_func(arg1, arg2, ...).
Then start the various asynchronous activities, like loading
modules. When it's done, each one calls
checkpoint.doneWith("this");
checkpoint.doneWith("that");
checkpoint.doneWith("theother");
and once they've all checked in, checkpoint runs cont_func().
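A minimal sketch of such a checkpoint object, using the names from the note above. It assumes every waitFor() call is made before the first activity can finish; otherwise the continuation would fire early:

```javascript
// Sketch of the proposed checkpoint object. Extra constructor
// arguments are saved and passed to cont_func when it runs.
function waitForIt(cont_func) {
    this.cont_func = cont_func;
    this.args = Array.prototype.slice.call(arguments, 1);
    this.pending = {};     // names still outstanding
    this.count = 0;        // number of outstanding names
    this.fired = false;    // make sure cont_func runs only once
}

// Register an activity that must finish before cont_func runs.
waitForIt.prototype.waitFor = function (name) {
    if (!Object.prototype.hasOwnProperty.call(this.pending, name)) {
        this.pending[name] = true;
        this.count++;
    }
};

// Report that an activity has finished. When the last one checks in,
// run the continuation.
waitForIt.prototype.doneWith = function (name) {
    if (!Object.prototype.hasOwnProperty.call(this.pending, name)) {
        return;            // unknown or already finished: ignore
    }
    delete this.pending[name];
    this.count--;
    if (this.count === 0 && !this.fired) {
        this.fired = true;
        this.cont_func.apply(null, this.args);
    }
};
```

Ignoring unknown names in doneWith() is a design choice here; throwing an error instead would catch typos sooner.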
- Feed maintenance: feeds go stale, get deleted, etc. all the time.
It'd be really nice if the index page showed the ones with errors.
Check the HTTP status: if it's 4xx, then the feed is dead.
- Guideline: Apple recommends using 'delete' to free up memory and
help the garbage collector. Basically, any time you have
var my_foo = new Foo;
write
delete my_foo;
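wherever you're done with it. (Caveat: 'delete' only removes object
properties; it has no effect on a variable declared with 'var', so
there it may be better to assign null.)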
Also, use 'var', to make variables local (so they can be
destroyed when they go out of scope).
- High Performance JS book:
- Move <script> tags to the bottom of <body>: loading scripts blocks,
so this at least shows HTML to the user while the JS loads.
- Use <script defer> when possible.
What qualifies as "not modifying the DOM"?
- Loading JS-specific CSS, and loading additional scripts from JS: p.7
<script type="text/javascript">
var stylesheet = document.createElement("link");
stylesheet.rel = "stylesheet";
stylesheet.type = "text/css";
stylesheet.media = "all";
stylesheet.href="css/style-js.css";
document.getElementsByTagName("head")[0].appendChild(stylesheet);
</script>
This can be put at the end.
- Profiling and debugging should be per-page options. These can be
turned on on a per-page basis or something, and the requisite
scripts loaded dynamically, to speed things up in the normal case
(pp. 7-9).
- If JS gets complicated, could use an Emacs-like autoload scheme:
define an array of functions such that, if they're ever called,
load the requisite JS script, then call the original function.
Can use the same introspection techniques as the profiler.
- Building HTML with innerHTML is faster on all browsers except
WebKit, though the difference is far less significant than it used
to be.
- cloneNode method. Is this useful at all?
- The getElementsBy*() functions return collections. Accessing them
is expensive, because they're live. This includes getting the
length. So instead of
for (i = 0; i < collection.length; i++)
use
for (i = 0, len = collection.length; i < len; i++)
- Repaints/reflows are expensive, but necessary for
collapsing/expanding articles.
Is it worth getting the height and Y-offset of each <div item>
in an onload callback, and caching the value in case the user
wants to collapse/expand something?
Use onload, since that's when all the images will have loaded,
and we know what the final look of the page is.
- Worker threads can be used for long-running functions without
affecting the UI: they spawn off a second thread and do stuff
there. However, they can't touch the DOM, and can only communicate
with the main thread through callbacks. See
https://developer.mozilla.org/en/Using_web_workers
- Set the expiration time on pages, so that they get cached. By
convention, caching for slightly less than a year means forever.
But also want to pick up new versions. A week, maybe?
- Articles with only content, no summary, should also be collapsible.
Set the max. height
- Does it make sense to have 'onclick="javascript:foo"'? Or can we
just have 'onclick="foo"'?
- Use addEventListener() instead of on*=script
- Bookmarklet to mark an article as read.
If I stumble across, say, a Bad Astronomy article in Twitter,
I don't want to read it again in the normal feed. It'd be nice to
mark it as read.
There's a stub function in database.inc: db_mark_url(), which
marks items as read that have a given URL.
One potential problem: what if I find out about a new post
from Twitter, before NewsBite has refreshed the feed? Then there
will be no item in the database with the given URL.
Perhaps the way to get around this is, if the post URL isn't
already in the database, add it to a list in another table. The next
time feeds are updated, check the list again, mark any matches as
read, and remove them from the list.
Problem is, what if the sequence is:
1) Mark http://pharyngula.org/some-post as read
2) No such post found; add it to the check-later list
3) User updates Bad Astronomy feed
4) Check the list again; http://pharyngula.org/some-post still
doesn't exist in the database.
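One way around the sequence above is to keep an unmatched URL on the check-later list across updates, and only drop it when a matching item shows up or the entry gets too old. A sketch with in-memory stand-ins for the database tables (ReadTracker, markUrl, itemsAdded and maxAgeMs are all hypothetical names, not existing code):

```javascript
// Hypothetical in-memory stand-ins for the items table and the
// check-later table.
function ReadTracker() {
    this.items = {};      // url -> { read: boolean }
    this.pending = {};    // url -> time (ms) the user asked to mark it
}

// Bookmarklet entry point: mark the item read if we have it,
// otherwise remember the URL for later.
ReadTracker.prototype.markUrl = function (url, now) {
    if (this.items.hasOwnProperty(url)) {
        this.items[url].read = true;
    } else {
        this.pending[url] = now;
    }
};

// Called after any feed update with the URLs of newly stored items.
ReadTracker.prototype.itemsAdded = function (urls, now, maxAgeMs) {
    var i, url;
    for (i = 0; i < urls.length; i++) {
        url = urls[i];
        if (!this.items.hasOwnProperty(url)) {
            this.items[url] = { read: false };
        }
        if (this.pending.hasOwnProperty(url)) {
            this.items[url].read = true;
            delete this.pending[url];
        }
    }
    // Entries that never matched are dropped only once they are too old.
    for (url in this.pending) {
        if (this.pending.hasOwnProperty(url) &&
            now - this.pending[url] > maxAgeMs) {
            delete this.pending[url];
        }
    }
};
```

With this, updating an unrelated feed leaves the pending entry alone, and it is consumed whenever the right feed finally delivers the post.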
- Search function: find words in feeds, titles, articles, etc.
- Create full-text index on summary, content. Perhaps title.
CREATE FULLTEXT INDEX tbl_index ON tbl (col, col...);
- Make updates smarter about updating when no network available.
On iPhone, don't contact the network if there's no network to contact.
- Ads: perhaps should generalize: look for
<a href="$some_ad_url">...</a>, optionally wrapped in <p>...</p>.
- WaPo duplicates: "Today's Highlights" can duplicate stories from
other feeds. Would be nice to detect these and delete duplicates.
Ditto for Google News.
It might be useful to define a cluster of feeds: "WaPo
Highlights", "WaPo International", and "WaPo Top Stories" could be
grouped together, and duplicate items removed.
Sometimes the URLs are different (e.g., WaPo or NYT use an
"&src=" or "#foo" parameter to say which feed the user came from).
Need a way to normalize URLs.
Perhaps the best way to do this is to just allow the user to
define arbitrary groups, then attach properties/plugins/hooks to
groups, as well as to individual feeds.
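A rough sketch of URL normalization for duplicate detection, using the standard URL class. It assumes "src" is the only tracking parameter worth stripping; the real list would have to be collected per site, and TRACKING_PARAMS and normalizeUrl are made-up names:

```javascript
// Assumption: "src" is the only tracking parameter. Other sites
// will need their own entries.
var TRACKING_PARAMS = ["src"];

// Strip the fragment and any known tracking parameters, so that two
// links to the same story compare equal.
function normalizeUrl(raw) {
    var url = new URL(raw);
    url.hash = "";
    TRACKING_PARAMS.forEach(function (name) {
        url.searchParams.delete(name);
    });
    return url.toString();
}
```

Two items from feeds in the same cluster could then be treated as duplicates when their normalized URLs match.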
- Sorting by many criteria
It'd be nice to sort all of the items by many criteria, e.g.,
comics come before news; comics are shown oldest-first, while news
items are shown newest-first; dKos articles by Hunter percolate to
the top; etc.
This is probably too hard to sort in MySQL, so presumably need
to do it in PHP. However, don't want to sort a massive data set on
the fly.
Might be possible to pre-sort in a cron job. But to avoid
sorting a massive data set, would it be possible to use some variant
of quicksort to first sort by time, then sort by "comics go before
news", etc.?
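Rather than sorting several times, the criteria can be folded into a single comparator that checks them in priority order. A sketch, written in JavaScript for illustration even though the note above places the sort in PHP. The item shape (kind, time, boost) is invented, and putting boosted items ahead of the comics/news split is an assumption:

```javascript
// Hypothetical item shape:
//   { kind: "comic" | "news", time: Number, boost: Number }
var KIND_RANK = { comic: 0, news: 1 };

function compareItems(a, b) {
    // 1. Boosted items (e.g., a favorite author) percolate to the top.
    if (a.boost !== b.boost) { return b.boost - a.boost; }
    // 2. Comics come before news.
    if (a.kind !== b.kind) { return KIND_RANK[a.kind] - KIND_RANK[b.kind]; }
    // 3. Within a kind: comics oldest-first, news newest-first.
    return a.kind === "comic" ? a.time - b.time : b.time - a.time;
}
```

Usage is just items.sort(compareItems). A single pass like this does not depend on the sort being stable.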
- Might be useful to add gestures, for portable devices. See
http://depts.washington.edu/aimgroup/proj/dollar/
On the N810, one can scroll the display by drawing on it. So
adding gesture recognition is likely to play havoc with that. Ditto
iPhone/iPad.
Perhaps hard-code straight-line templates, and implement
scrolling manually. (Swipe, on iPhone/iPad.)
That implementation is rotation-independent: "v" can be
drawn as ">" and still be recognized as "v". To fix this, can divide
space into quadrants (or octants) centered on centroid, and classify
shapes based on where the starting point is WRT the centroid.
This way, "v" would have the starting point in the NW quadrant,
while ">" would have it in the SW quadrant, so they'd be seen as
different.
This is particularly important because it would be desirable
to have straight lines at different angles (e.g., stroke up to go to
previous article, stroke down to go to next article), while the $1
gesture algorithm would see those as the same shape.
To have straight lines, it would be necessary to add the hacks
they suggest in the paper: check to see whether one of the
dimensions is "too small".
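The distance calculation has a square root. This seems
expensive and unnecessary, since we're only trying to find the
closest match. Dropping it might save some CPU cycles (though a sum
of squared distances can rank templates differently from a sum of
distances, so check that recognition doesn't suffer).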
Which gestures would we want?
- Go to previous article
- Go to next article
- Mark as read and move to next article
- Mark as unread?
- Refresh list of articles
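A sketch of the hard-coded straight-line case: if the stroke's bounding box is "too small" in one dimension, treat it as a line and classify it by direction. Anything else would go to a full recognizer. classifyStroke and the minRatio threshold are assumptions for illustration, and the function expects at least one point:

```javascript
// points: array of {x, y} in screen coordinates (y grows downward).
// minRatio is an assumed threshold for "too small"; it would need
// tuning on a real device.
function classifyStroke(points, minRatio) {
    var xs = points.map(function (p) { return p.x; });
    var ys = points.map(function (p) { return p.y; });
    var width = Math.max.apply(null, xs) - Math.min.apply(null, xs);
    var height = Math.max.apply(null, ys) - Math.min.apply(null, ys);
    var first = points[0], last = points[points.length - 1];

    // Tall and narrow: a vertical stroke.
    if (height > 0 && width / height < minRatio) {
        return last.y < first.y ? "up" : "down";
    }
    // Wide and flat: a horizontal stroke.
    if (width > 0 && height / width < minRatio) {
        return last.x < first.x ? "left" : "right";
    }
    return "other";   // not a straight line; hand off to a full recognizer
}
```

Because direction comes from the first and last points, "up" and "down" stay distinct, which a rotation-independent matcher would not give us.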
- Need tool to fix database schema.
ALTER TABLE items CHANGE COLUMN url url VARCHAR(255);
ALTER TABLE items CHANGE COLUMN comment_url comment_url VARCHAR(255);
ALTER TABLE items CHANGE COLUMN comment_rss comment_rss VARCHAR(255);
- Should be able to read both read and unread articles.
- Would be nice to have either "new" flag or a "saved for later" flag.
The latter is probably more useful. Similar to (Mobile?) Safari's
"Reading List" feature.
That way, can go through today's stuff fairly quickly, without
having to scroll through messages you've already seen.
Perhaps make this a pseudo-feed (for which need support for
pseudo-feeds). Add a "star INT" field to the database, perhaps?
Don't display starred items in normal view. Having "star" be an int
allows defining multiple star types (to mark articles in different
ways).
- Should there be a module (or something) for Twitter updates?
- There should really be a tool for testing plugins.
For removing ads and such: it'd be nice to plug the XML or the
article into a temp file, and
- Makefile:
Run tests:
Try to call various functions and make sure they return
correct results
php-cgi -c php.ini <script>
- Documentation for how the plugins for feed.inc work.
Texinfo? Add code to Makefile to generate HTML documentation.
- More fine-grained function to select items from a feed:
- One feed, or all feeds
- Time range: e.g., the last 24 hours
- Order: oldest to youngest/youngest to oldest
- How many articles to get
- Passwords probably shouldn't be stored in database. As first pass,
try removing them from the database. Put them in another file, and
add functions to read and write passwords.
Plus, it's possible that at some point it'll become
necessary/desirable to use authentication methods other than
username/password, possibly involving certificate exchange or
something.
We already have a server-side secret string. Could encrypt
feed passwords with that string.
- Should be able to specify password for all of livejournal.com, not
have to specify it separately for every LJ feed.
Use parse_url() to get hostname.
LJ friends feeds in OPML format:
http://www.livejournal.com/tools/opml.bml?user=arensb
- SQL schema should support folders, i.e., user can group feeds into
categories. By default, all feeds go in a " root" folder (the
leading space marks this as special; strip space from user-defined
folders).
Folders can contain subfolders, and so on. Add a table saying
which feeds and folders belong to which other folders, much like
/etc/group. That way, a feed can be in multiple groups. (Though will
need to make sure to avoid loops, e.g., groups containing themselves
(perhaps indirectly), or at least avoiding infinite
loops/recursion.)
Need a function to figure out the reverse: given a group,
figure out which feeds are in it (is this worth caching?).
AFS's pts uses negative UIDs to represent groups. Perhaps the
same can be done here, in the table that says which feeds/groups are
in which groups.
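A sketch of the "which feeds are in this group" lookup with a guard against groups that contain themselves. The membership table is modeled as a plain object here; in the real schema it would be a database table, and feedsInGroup is a hypothetical name:

```javascript
// members: map from group name to an array of entries. An entry that
// is itself a key of 'members' is a subgroup; anything else is a feed.
function feedsInGroup(members, group, seen) {
    seen = seen || Object.create(null);
    if (seen[group]) { return []; }     // already visited: break the loop
    seen[group] = true;

    var feeds = [];
    (members[group] || []).forEach(function (entry) {
        var found = members.hasOwnProperty(entry)
            ? feedsInGroup(members, entry, seen)
            : [entry];
        // Merge without duplicates, since a feed can be in several groups.
        found.forEach(function (feed) {
            if (feeds.indexOf(feed) === -1) { feeds.push(feed); }
        });
    });
    return feeds;
}
```

The 'seen' set makes the walk terminate even if the table contains a loop, so the loop check does not have to be enforced at insert time.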
- Ajax: user should also be able to drag feeds around, so they're
listed in arbitrary order. Need an "nth" field somewhere, giving the
numeric order in which the feed is listed.
Ideally, it should be possible to have a feed in multiple
folders (e.g., Pharyngula goes in both "Science" and "Atheism", or
Daily Kos goes in both "Politics" and "Stuff I Read Daily"), so need
a separate entry (and nth field) for each instance. And each user.
- LiveJournal plugin (or something): would be nice to be able to say
"I'm user So-and-so at LiveJournal" (or other site using the LJ
code) and have it automatically subscribe to your friends feeds
there.
Need to somehow keep track of the fact that these feeds were
auto-generated. If I change my password on LiveJournal, I should
only have to change it once (in the LJ plugin config) and have the
engine figure out that the password for all the LJ-friends feeds has
changed as well.
- Multi-user support: this would be nice. It would also be nice if the
back-end could avoid duplicating information, i.e., not store two
records for the same article.
However, for now, there is enough user-specific information in
the schema for both feeds and items that this isn't practical. Start
out with naive implementation, and see if anyone starts using it on
a multi-user system.
- Killfile: automatically mark as read messages that match certain
criteria.
Perhaps this should have a scoring system or something: I'd
like to killfile the stories on dKos that match /\w\w-\d\d/ (because
those talk about specific senate/house races that I generally don't
care about), but show the ones that match /MD-\d\d/, since I do care
about those. Or kill the "diary rescue" threads, unless they contain
"arensb".
Should probably be done in plugin.
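A sketch of the scoring idea, shown in JavaScript for illustration (a real plugin would presumably be PHP). Each matching rule adds its score, and a negative total means "kill". The patterns come from the examples above; the weights and the names RULES, score and shouldKill are invented:

```javascript
// Each rule adds its score when the pattern matches the text.
// Weights are illustrative only.
var RULES = [
    { pattern: /\w\w-\d\d/, score: -10 },      // specific House/Senate races
    { pattern: /MD-\d\d/, score: 20 },         // ...but keep Maryland ones
    { pattern: /diary rescue/i, score: -10 },  // diary rescue threads
    { pattern: /arensb/, score: 20 }           // ...unless they mention me
];

// Sum the scores of all rules that match.
function score(text) {
    return RULES.reduce(function (sum, rule) {
        return sum + (rule.pattern.test(text) ? rule.score : 0);
    }, 0);
}

// An article is auto-marked as read when its total is negative.
function shouldKill(text) {
    return score(text) < 0;
}
```

Making the "keep" weights larger than the "kill" weights is what lets the exceptions win.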
- Smart groups: like iTunes smart playlists, or killfiles (above):
automatically populate a group based on user-specified criteria.
In fact, ordinary feeds could be implemented with this
mechanism: the ordinary dKos article list could simply be the set of
all articles that come from the dKos feed.
- Don't want to automatically display a feed's image: some are good
(like LJ icons), others are annoying, like FeedBurner feeds that
just have a "Feed powered by FeedBurner" image.
Make this a per-feed customizable option.
- Provide hooks for various plugins to do their thing.
- After adding or updating a feed in the database. Perhaps to
add fields to a custom table.
Could use this to automatically delete subscription to
feeds that haven't been updated in a while, e.g., comment
threads.
- After adding or updating an item in the database. Perhaps to
add fields to a custom table.
- Mark items as read as they come in, based on subject or
category or whatever.
- Before marking an item as read.
- Before deleting an item in the database. Perhaps to clean up
custom tables.
- Before deleting a feed in the database. Perhaps to clean up
custom tables.
- i18n.
Can start by identifying translatable strings and marking
them, so they can be translated later. Define dummy _() and N_()
functions.
Can't trust things like $LANG and $LC_* to decide which
language to translate to. Probably ought to try to guess from
browser settings. Failing that, use a cookie or an option in the
database.
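A sketch of guessing the language from browser settings, by parsing the Accept-Language header. It is written in JavaScript for illustration, though the real check would likely live on the PHP side. pickLanguage and its parameters are made-up names:

```javascript
// header: value of the browser's Accept-Language header.
// supported: lower-case language codes we have translations for.
// fallback: what to return when nothing matches.
function pickLanguage(header, supported, fallback) {
    var prefs = [];
    header.split(",").forEach(function (part) {
        var bits = part.split(";");
        var tag = bits[0].trim().toLowerCase();
        var q = 1;                      // default weight when no q= is given
        for (var j = 1; j < bits.length; j++) {
            var m = /^\s*q=([0-9.]+)\s*$/.exec(bits[j]);
            if (m) { q = parseFloat(m[1]); }
        }
        if (tag !== "" && q > 0) { prefs.push({ tag: tag, q: q }); }
    });
    // Highest weight first.
    prefs.sort(function (a, b) { return b.q - a.q; });

    for (var i = 0; i < prefs.length; i++) {
        var exact = prefs[i].tag;
        var primary = exact.split("-")[0];   // "fr-ca" -> "fr"
        if (supported.indexOf(exact) !== -1) { return exact; }
        if (supported.indexOf(primary) !== -1) { return primary; }
    }
    return fallback;
}
```

A cookie or database option, when present, would simply override whatever this returns.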
- Browser IDs:
Wii browser: