abma.x-maru.org Forum Index

Possible New Animeusenet.org Service: Realtime(-ish) Logging
matt



Joined: 11 Feb 2002
Posts: 34
Location: Cleveland Ohayo

Posted: Sat Oct 04, 2003 6:17 pm    Post subject: Possible New Animeusenet.org Service: Realtime(-ish) Logging

I am testing a new script at animeusenet and would like to make it a part of the site, but I thought I would get some feedback from you guys first.
http://www.animeusenet.org/live.php

It’s a script that updates every hour with the recent posts from the newsgroups. It covers ABMA, ABA, ABMAR, ABMAR (raws), ABAT and ABAV.

The basic idea is that the posts would accumulate during the day in this script and then every night would be added to the searchable history archives. The script allows any visitor who has edit access (available to anyone upon request) to edit a post's information (if the visitor sees something wrong). This would completely change our current logging process and pretty much eliminate the current manual process (yay!).

I gave edit access to a few people who are active on this forum and have animeusenet accounts, so you can see how it would work.

So the question is, would this be useful to anyone or is it overkill and you’re happy with the nightly updates?

//----how the script works, stop reading now if you couldn't care less----

Every hour at 10 till, the script connects to a giganews server and downloads the latest headers (it stores the headers it has used, to eliminate the need to redownload them). It then looks for postings of RAR archives (.rar, .part01, .001, etc.). It then downloads the body for that article and decodes the rar file (well, a piece of it). Out of that rar file it pulls the video file name, file size and video codec. (One problem I have is that if there is more than one file packed into the rar, such as a .sfv file, it will not be able to pull the data out. I’m still working on that one.)
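
The "looks for postings of RAR archives" step could be sketched roughly as below. This is only an illustration in Python, not the actual script: the function name `is_first_volume` and the exact naming rules are assumptions, based on the common habit of naming the first piece of a set `.rar`, `.part01.rar` or `.001`.

```python
import re

# Assumed naming conventions (not taken from the real script):
#   name.part01.rar / name.part001.rar -> numbered multi-part set
#   name.rar                           -> single archive or first old-style volume
#   name.001                           -> first piece of a split file
PART_RAR = re.compile(r"\.part(\d+)\.rar$", re.IGNORECASE)
SPLIT_PIECE = re.compile(r"\.(\d{3})$")


def is_first_volume(filename: str) -> bool:
    """Return True if the file name looks like the first piece of a RAR set."""
    name = filename.strip()
    match = PART_RAR.search(name)
    if match:
        # Numbered set: only part 1 holds the start of the archive.
        return int(match.group(1)) == 1
    if name.lower().endswith(".rar"):
        return True
    match = SPLIT_PIECE.search(name)
    return bool(match) and int(match.group(1)) == 1


if __name__ == "__main__":
    for candidate in ("Love_Hina_-_24[e-f].part01.rar",
                      "Love_Hina_-_24[e-f].part02.rar",
                      "show.001"):
        print(candidate, is_first_volume(candidate))
```

Only the first piece is worth downloading here, since that is where the archive's file listing starts.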

To find the title of the anime, I have a def file with over 600 titles in it. The script looks through the file and compares the file name to it, trying to figure out what it is.
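
A minimal sketch of that kind of lookup, in Python. The format of the def file is not described in this thread, so the `TITLE_DEFS` mapping of aliases to titles, and the helper names `normalize` and `find_title`, are made up for illustration.

```python
import re

# Hypothetical stand-in for the def file: alias -> canonical title.
TITLE_DEFS = {
    "inuyasha": "InuYasha",
    "love hina": "Love Hina",
    "wolf's rain": "Wolf's Rain",
    "planetes": "Planetes",
}


def normalize(text):
    """Lowercase, drop apostrophes, and turn other punctuation into single spaces."""
    text = text.lower().replace("'", "")
    return " ".join(re.sub(r"[^a-z0-9]+", " ", text).split())


def find_title(filename, defs=TITLE_DEFS):
    """Return the title whose alias appears in the file name, preferring the longest alias."""
    haystack = f" {normalize(filename)} "
    best = None
    for alias, title in defs.items():
        needle = f" {normalize(alias)} "
        if needle in haystack and (best is None or len(needle) > len(best[0])):
            best = (needle, title)
    return best[1] if best else None


if __name__ == "__main__":
    print(find_title("(Ani-Kraze)Inuyasha_139[XviD]_[9385E74E].avi"))
    print(find_title("Love_Hina_-_24[e-f].part01.rar"))
```

Padding both strings with spaces makes the match whole-word, so a short alias cannot match inside a longer word.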

To find the ep number the script has about a freaking bazillion regular expressions that try to figure it out. (doesn’t always work)
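
For a rough idea of what "a bazillion regular expressions" looks like in practice, here is a small Python sketch that tries a few patterns in order and takes the first hit. The three patterns and the name `find_episode` are assumptions for illustration; the script's real expressions are not shown in this thread.

```python
import re

# Hypothetical patterns, tried from most to least specific.
EPISODE_PATTERNS = [
    # "Ep 51", "episode_07"
    re.compile(r"\bep(?:isode)?[ ._-]*(\d{1,3})\b", re.IGNORECASE),
    # "Love_Hina_-_24[e-f]": number right after a " - " style separator
    re.compile(r"[ _]-[ _](\d{1,3})(?!\d)"),
    # "Inuyasha_139[XviD]": number set off by a space/underscore and a delimiter
    re.compile(r"[ _](\d{1,3})(?=[ _\[\(.])"),
]


def find_episode(name):
    """Return the first episode number found, or None if no pattern matches."""
    for pattern in EPISODE_PATTERNS:
        match = pattern.search(name)
        if match:
            return int(match.group(1))
    return None


if __name__ == "__main__":
    print(find_episode("Love_Hina_-_24[e-f].part01.rar"))
    print(find_episode("(Ani-Kraze)Inuyasha_139[XviD]_[9385E74E].avi"))
```

Each pattern accepts one to three digits and refuses to stop in the middle of a longer number, so a year like 2003 is not mistaken for an episode.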

I also have a def file for the fansubbers.

It compiles all that data together and updates the table that’s reflected on live.php. Any data that is missing will hopefully be added in by visitors.

That's it, thanks for any feedback,
matt
Gorunova



Joined: 10 Feb 2002
Posts: 318
Location: Burnaby, B.C., Canada

Posted: Sun Oct 05, 2003 6:53 pm

I'm surprised at how well it seems to work already, Matt. Good job.

My main concern would be the rate at which people correct the data and accept it into the main database. It should be easy for them to type in corrections and maybe even select close matches from the alias list for the series title.

I wonder if it might be possible to suggest an easily parsable subject line format and include that as the suggested format in the FAQ and in inc's NAGs, as a way of increasing logging accuracy.
matt



Joined: 11 Feb 2002
Posts: 34
Location: Cleveland Ohayo

Posted: Sun Oct 05, 2003 7:15 pm

Thanks,

Gorunova wrote:
My main concern would be the rate at which people correct the data and accept it into the main database.


The data would sit in the "live" table until an admin (probably me) commits all the posts for that day into the searchable main database. So the timing should not be an issue.
xo
Site Admin


Joined: 09 Feb 2002
Posts: 466
Location: Los Angeles [comcast]

Posted: Sun Oct 05, 2003 11:13 pm

Gorunova wrote:

I wonder if it might be possible to suggest an easily parsable subject line format and include that as the suggested format in the FAQ and in inc's NAGs, as a way of increasing logging accuracy.


Oh, but that would be too easy! Stop teasing! Some of us are still doing this by hand! ;)

matt, I publicly bow before your scripting might. That's an insane task you took on, and that you pulled it off speaks volumes. You should go work for Google or something.

-xo
oblio



Joined: 20 Feb 2002
Posts: 106
Location: Detroix, MI

Posted: Mon Oct 06, 2003 5:11 am

I pull all my anime with a perl script I wrote that uses regexes to grab shit. What an unholy pain in the ass. Good job going the extra yards and unwinding the rars for internal filenames...

Despite what xo says about standard subject lines, if you could throw out a couple suggestions about what makes it easiest for you, I wouldn't mind using it.
matt



Joined: 11 Feb 2002
Posts: 34
Location: Cleveland Ohayo

Posted: Mon Oct 06, 2003 9:03 am

If you're checking the page today (the 6th), I am having problems with the server right now, so it won't be updating for the next few hours.

oblio wrote:
Despite what xo says about standard subject lines, if you could throw out a couple suggestions about what makes it easiest for you, I wouldn't mind using it.


Really, as long as there are no requests ("REQ: PLS POST NARUTO EP 51!") in the subject, the script should be able to pull out the name, provided that title is in the def file. Short of making the subject line "Title=blah Ep=#blah", I'm not sure I could make it much better.
(inc)



Joined: 18 Feb 2002
Posts: 356
Location: San Diego

Posted: Mon Oct 06, 2003 2:08 pm

Quote:
as long as there are no "REQ: PLS POST NARUTO EP 51!"
Taking requests out of binary subject lines has been a nag since their formal start (1/1/03 -- easy to remember). I had the feeling -- purely subjective -- that there was actually some success. That is, until this Fall. Now every noob poster in abmar seems to be including requests. :P Hehe... I'm even getting flamed about it -- very strange. 8)

What I had been considering doing even before I saw this thread was changing the nag to reflect more the issues with AnimeUsenet and less the inconvenience to leeches -- stressing that it's to the poster's own benefit not to do REQ's in binary subject lines. Trying to institute a strict *standard* subject with the nag may be asking too much (in fact I'm sure it is), but I'm willing to make the attempt to get folks to make their SLs as parsable (sic??) as possible -- the return value, if it works at all, may be worth a few *slings & arrows*.

(inc)
Keikai



Joined: 18 Feb 2002
Posts: 178
Location: Miami, FL

Posted: Tue Oct 07, 2003 12:19 am

Aye, I took that attitude with subject recommendations in the FAQ. I tried to reason that what helps animeusenet, helps us all. And I put in some recommended tips for writing subjects along those lines. But, it's at the very least a good argument to add to your anti-REQ-in-subject NAG.

Unfortunately, instituting formalized subject lines is just not going to happen, much as I'd be happy to see it.
xo
Site Admin


Joined: 09 Feb 2002
Posts: 466
Location: Los Angeles [comcast]

Posted: Tue Oct 07, 2003 11:59 pm

An idea - the "FTD" tag in subject lines seems a fairly innocuous value-added bit that doesn't annoy people too much and actually gets people curious. How about a similar tag for people who want to use a standardized subject line?

Something like:
example wrote:

Love Hina | 24 | xo | 2003-10-04 | ogm divx3 | sub | e-f | 13 | 128MB | +par2 | #AU# - yenc - [01/24] - Love_Hina_-_24[e-f].part01.rar (*/24)


where the "fields" are title, ep, name of poster, date of post, video format/codec, sub/dub/raw, fansub group, number of RAR parts, video size, and additional comments such as inclusion of pars, version designations, etc.

matt and Gorunova will recognize this - this is the internal format we use(d) for logging entries. The separator isn't fixed in stone, but the pipe character is less frequently used than most other characters and the presence of #AU# would help a parser identify pre-formatted entries. Note I put it at the end - I still believe in the usefulness of subject sorting by title, pipe dream that it is. Everything after the #AU# would be software-appended by PP2K or whatever upload software is used.
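
To show how little a parser would need with such a tag, here is a minimal Python sketch, assuming the ten pipe-separated fields in the order listed above, followed by #AU#. The field names and the function `parse_au_subject` are invented for the example; nothing like this is claimed to exist in matt's script.

```python
# Field names are hypothetical labels for the ten fields described in the post.
FIELDS = ("title", "episode", "poster", "date", "format",
          "type", "group", "rar_parts", "size", "comments")
TAG = "#AU#"


def parse_au_subject(subject):
    """Split a tagged subject line into named fields, or return None if it does not fit."""
    head, tag, tail = subject.partition(TAG)
    if not tag:
        return None  # no #AU# marker: not a pre-formatted subject
    values = [value.strip() for value in head.split("|")]
    # The separator written just before the tag leaves one empty trailing value.
    if values and values[-1] == "":
        values.pop()
    if len(values) != len(FIELDS):
        return None
    record = dict(zip(FIELDS, values))
    record["appended"] = tail.strip()  # whatever the upload software added
    return record


if __name__ == "__main__":
    example = ("Love Hina | 24 | xo | 2003-10-04 | ogm divx3 | sub | e-f | 13 | "
               "128MB | +par2 | #AU# - yenc - [01/24] - "
               "Love_Hina_-_24[e-f].part01.rar (*/24)")
    print(parse_au_subject(example))
```

Anything without the tag, or with the wrong number of fields, is rejected, so such subjects would fall through to the regular heuristics.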

Maybe this is moot at this point- we're moving toward a more generally automated system and matt's new setup will use broader heuristics than just the subject line. But it could help, and if enough people do it, it might spark curiosity and copycats.

So there's my gauntlet.

-xo
xo
Site Admin


Joined: 09 Feb 2002
Posts: 466
Location: Los Angeles [comcast]

Posted: Wed Oct 08, 2003 12:03 am

Actually, the poster and date can be left out since they can be determined unambiguously from other NNTP header lines.

-xo
Melchior



Joined: 19 Feb 2002
Posts: 190
Location: Vancouver, BC, Canada

Posted: Thu Oct 16, 2003 9:48 pm

Wow, I'm impressed! Congrats Matt, that's an impressive piece of scriptage that you've got.

I don't really have much else to say about it-- good luck automating Animeusenet-- I can't imagine how much of a pain it is to manually input the day's posts, and anything that helps reduce that labour requirement can only be a good thing!
(inc)



Joined: 18 Feb 2002
Posts: 356
Location: San Diego

Posted: Mon Feb 23, 2004 8:06 am

Hi Matt,

Wondering why the script wouldn't like InuYasha -- ep 138 posted right after 137 was bypassed as was ep 139 that I posted yesterday, while ep 140 was numbered "14". Was there something about the subject line...
IY -=- 139 [xvid, sub]__[$1/$2] <yEnc> - Inuyasha_139_(Ani-Kraze).part...
...or the file name:
(Ani-Kraze)Inuyasha_139[XviD]_[9385E74E].avi
...that might give a problem?

Or might it have been a problem with incompletes on the *server(s) of record*?

I'm perfectly willing to change to anything that might yield better results.

And I really need to check the tracker everyday -- should have caught and edited that "14" myself.

Just occurred to me that maybe 138+ might be filtered as being too high for an episode number, and thus be labeled an error, lol.

(inc)
matt



Joined: 11 Feb 2002
Posts: 34
Location: Cleveland Ohayo

Posted: Tue Feb 24, 2004 8:49 pm

Thanks for checking, inc. I looked at the log file for that day. Basically, I never got around to updating the script to check for 3-digit ep numbers, so it saw Inuyasha 137 and 138 both as "Inuyasha Ep 13". The script does not allow the same title and ep from the same poster to be added twice in the same day (it saw the second one as a dupe post), so it let 137 in but deleted 138.
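
The interaction between the two-digit limit and the dupe check can be reproduced in a few lines of Python. This is a sketch under assumptions: the file names, the placeholder day, the two patterns and the function `log_posts` are all made up for illustration, and only the behaviour (137 kept as ep 13, 138 dropped as a dupe) comes from the description above.

```python
import re

# Assumed stand-ins for the script's episode matching.
TWO_DIGIT = re.compile(r"_(\d{1,2})")           # stops after two digits: 137 -> 13
THREE_DIGIT = re.compile(r"_(\d{1,3})(?!\d)")   # one possible fix: up to three digits


def log_posts(filenames, pattern, title="InuYasha", poster="inc", day="same-day"):
    """Keep at most one post per (title, episode, poster, day), like the dupe check."""
    seen = set()
    accepted = []
    for name in filenames:
        match = pattern.search(name)
        if not match:
            continue
        key = (title, int(match.group(1)), poster, day)
        if key in seen:
            continue  # same title/ep/poster/day: treated as a dupe and dropped
        seen.add(key)
        accepted.append((name, key[1]))
    return accepted


if __name__ == "__main__":
    posts = ["Inuyasha_137.rar", "Inuyasha_138.rar"]  # made-up file names
    print(log_posts(posts, TWO_DIGIT))
    print(log_posts(posts, THREE_DIGIT))
```

With the two-digit pattern both posts collapse onto episode 13 and the second one disappears. The same truncation would also turn episode 140 into "14".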

As for ep 139, I think what happened is that you started posting it late at night on Feb-21 (EST). The script kicks off at 15 till the hour, so I believe it ran before you started posting and missed it. It did not pick it up the next time it ran because it looks for the rar file, which by then was in the previous day's headers.

So the solution is, stop posting so much!

Haha, kidding :D

I’ll update it to allow 3 digit eps and look at tweaking the run time so it does not miss anything.

Thanks for checking your posts on the site; it lets me fix these little bugs.
matt
(inc)



Joined: 18 Feb 2002
Posts: 356
Location: San Diego

Posted: Tue Mar 02, 2004 4:59 pm

Been doing a few of the *fixes* on <unknown>'s on the Hourly updates as the opportunity presents itself -- hope that's alright.

I noticed my post of FMA 21 (A-Keep/ANBU) of 3/1-3/2 didn't get listed; another *boundary* instance? It started just before and was posted concurrently with the Planetes 11 that did get listed.

Hate to keep buggin' you... lol.

(inc)
(inc)



Joined: 18 Feb 2002
Posts: 356
Location: San Diego

Posted: Tue Mar 02, 2004 7:56 pm

Here's one more for you, matt. Been reposting some Wolf's Rain in abmar: 14 & 20-22 so far (around 2/26-2/27 for the last 3). For some reason none of the last 3 were caught by the script. The format was slightly different as the subber changed, but the rest was basically the same.

(inc)
All times are GMT - 8 Hours
Page 1 of 2

 