Saturday, May 16, 2009

Third Party Comment Engines

I've been playing around with different commenting engines on tinyvid.

Currently I use Intense Debate. An example of a video using this comment system is here. I like this service a lot. You can sign on with OpenId, the moderation tools are good and it seems to be generally reliable.

Intense Debate has support for plugins that anyone can write. Potentially I could write a video plugin that allowed embedding <video> into comments for example.

But it does have some issues. Sometimes comments don't seem to appear in the right order. I'll get an email that a comment was left, visit the page, and it's nowhere. Somehow it gets buried amongst other older comments. This started happening when I switched the ordering of the comments to have the most recent comment at the top.

Sometimes, depending on the content of your comment, the engine gets confused and the comment is scrambled. This seems to happen if I edit a comment that has a URL in it. It converts the URL to an HTML <a href/> and when editing and saving seems to reconvert, or do some weirdness with it. I've lost comments through this. Long comments can get silently truncated or trashed.

I set up tinyvid so I can easily add different comment engines and switch them on and off for individual videos for testing. I recently tried js-kit. An example of video with js-kit comments for testing is here.

I like js-kit too. I'm using their 'rating' widget and that seems to work nicely. The comments are ok too. You can sign in with OpenId. But there is some weirdness there. If I sign in using OpenId, leave a comment, then revisit the page later, then the comments don't appear. I have to delete the js-kit cookie to have them appear again.

There is a limit to the size of the comments, similar in size to Intense Debate's, of 3,000 characters. I suspect this limit isn't a problem in practice however.

There's a lot to like about js-kit, apart from the OpenId issue, but I need to try it out a bit more to see what other issues it has.

The last one I tried was Disqus. Example here. The Disqus site lets you register using OpenId. However you can't login to leave a comment using OpenId. You must login using a Disqus Id that gets assigned to you when you've registered.

When I registered using OpenId, then tried to login to leave a comment it requested a password. I had to go back to the Disqus site, change my profile to add a password. This seems sub-optimal since the point of me using OpenId is to avoid multiple passwords and accounts.

Disqus requires anonymous posters to leave a valid email address. I couldn't find a way to turn this option off. I'm sure this will just result in people leaving bogus email addresses.

One step of the Disqus setup requires adding a script block to the page that does document.write. This won't work on XHTML sites. And I dislike document.write usage in general. Thankfully this step appears optional. It's for converting marked up items in the page to a comment count.

After leaving a comment Disqus does a full page refresh. This unfortunately results in the video on the page restarting and reloading.

Apart from those issues Disqus seemed quite nice. Simple to set up and configurable.

So the end result is I'm not sure which is best. I'll play around with them for a bit more and get feedback. Are there any other's worth trying?

Categories:

Monday, May 11, 2009

Simple Nanojit Example

I've been playing around with Nanojit, thinking about using it in a toy project of mine.

Nanojit is a library for generating machine code with backends for x86 and ARM. It's used by Tamarin and TraceMonkey for their just in time compilers.

The Mozilla documentation contains some documentation which gives an overview of what's there:Currently Nanojit seems to be directly included in both the Tamarin and Mozilla repositories. To make it easy to reuse in my various test projects I split out the nanojit code and put it in a github repository. The build is a simple makefile and is only setup for x86 since that's that platform I'm trying it on for now.

In the 'example' directory of the nanojit github repository is the example from the Mozilla documentation. This is what I'm basing my usage of it from.

As a simple test of using it I wrote an expression evaluator. You enter simple mathematical expressions using addition, subtraction, multiplication and division at the command line and it compiles the expression into machine code using nanojit and calls it to get the result.

I started with an interpreter first and added the compiling once that was working. To parse the expressions I used Ian Piumarta's Parsing Expression Grammar library. I tweaked it a bit to be able to use it from C++. The grammar is simple and produces a basic Abstract Syntax tree:
Expr      = lhs:Product (  "+" rhs:Product  { lhs = new Add(lhs, rhs); }
| "-" rhs:Product { lhs = new Subtract(lhs, rhs); }
)* { $$ = lhs }
Product = lhs:Value ( "*" rhs:Value { lhs = new Multiply(lhs, rhs); }
| "/" rhs:Value { lhs = new Divide(lhs, rhs); }
)* _ { $$ =lhs }
Value = Number _
Number = < [0-9]+ ('.' [0-9]+)?> { $$ = new Number(atof(yytext)); }
Running this through the 'leg' program produces C code that can be included in the project to compile the expression entered by the user. The interpreter just walks over the AST, calling an 'eval' method, to produce the result. Code looks like:
float Number::eval()  { return value; }
float Add:eval() { return lhs->eval() + rhs->eval(); }
float Subtract:eval() { return lhs->eval() - rhs->eval(); }
float Multiply:eval() { return lhs->eval() * rhs->eval(); }
float Divide:eval() { return lhs->eval() / rhs->eval(); }
int main() {
Object* result = parse();
cout << "Result: " << result->eval() << endl;
return 0;
}
Converting this to generate machine code and call it is pretty easy. Instead of an 'eval' method I created a 'compile' method. 'compile' takes an object as an argument that I call methods on to generate Nanojit intermediate representation (LIR). It returns the resulting generated LIR instruction. The code for the Number AST object looks like:
LIns* Number::compile(LirWriter* writer) {
return writer->insImmf(value);
}
The LirWriter object contains methods to generate LIR code. For Number we generate an immediate float value and return a pointer to the generated instruction.

The operators are implemented in a similar manner. I'll just show the code for Add:
LIns* Add::compile(LirWriter* writer)
{
LIns* a = lhs->compile(writer);
LIns* b = rhs->compile(writer);
return writer->ins2(LIR_fadd, a, b);
}
Here we compile the left and right hand sides of the expression, saving pointers to the generated instructions. Then we generate a 'float add' LIR instruction. This is passed the instructions that were generated for the left and right parts of the expression. By calling compile on the topmost AST object we end up generating the code for the entire expression.

The code required to set up Nanojit and create the LirWriter is a bit more complicated. Most of what I did was copy and pasted from the example code. Basically I create a Fragmento object and a LirBuffer. Then each time an expression is entered I create a Fragment object which will contain the final generated code and a LirBufferWriter which is the base LirWriter object used by the 'compile' methods.

LirWriter's can be added to a LirBufferWriter to form a pipeline that instructions go through to do different things like optimisation or output debugging code before ending up in the LirBuffer. After this I generate a 'start' instruction, call the 'compile' method to do the rest, and generate a required epilogue instruction:
Fragmento *fragmento = new (gc) Fragmento(&core, 20);
LirBuffer *buf = new (gc) LirBuffer(fragmento, NULL);
...
Object* result = parse();
...
Fragment *f = new (gc) Fragment(NULL);
LirBufWriter writer(buf);
...
writer.ins0(LIR_start);
writer.ins1(LIR_fret, result->compile(&writer));
...
f->lastIns = writer.insGuard(LIR_loop, writer.insImm(1), rec_ins);

// Compile the fragment.
compile(fragmento->assm(), f);

// Call the compiled function.
typedef JS_FASTCALL float (*ExprFunction)();
ExprFunction fn = reinterpret_cast(f->code());
cout << "Result: " << fn() << endl;
The expression evaluator interpreter and compiler example can be downloaded and played with by cloning my 'play' repository on github. Once cloned you'll need to initialize and update the submodules to pull in the nanojit and peg third party libraries that it uses:
$ git clone git://github.com/doublec/play.git
$ cd play
$ git submodule init
$ git submodule clone
Run 'make' in the 'expr-interpreter' and 'expr-compiler' directories and run the 'expr' program that results. If you compile both nanojit and expr-compiler with DEBUG defined you get some debug output. It show's the LIR and the assembler code generated. For example, the LIR for: 2+3+4*2 is:
start
#40000000:0 /* 2 */
#40080000:0 /* 3 */
fadd1 = fadd #0x40000000:0, #0x40080000:0
#40100000:0 /* 4 */
#40000000:0 /* 2 */
fmul1 = fmul #0x40100000:0, #0x40000000:0
fadd2 = fadd fadd1, fmul1
ret fadd2
The assembly code generated for this is:
mov -8(ebp),0                  
mov -4(ebp),1073741824
mov -16(ebp),0
mov -12(ebp),1074266112
fldq -16(ebp)
fadd -8(ebp) f0(#0x40080000:0)
fstpq -8(ebp) f0(fadd1)
mov -16(ebp),0
mov -12(ebp),1074790400
mov -24(ebp),0
mov -20(ebp),1073741824
fldq -24(ebp)
fmul -16(ebp) f0(#0x40000000:0)
fadd -8(ebp) f0(fmul1)
It's easy to change the code to use integer math instead of floating point. Use LIR_ret instead of LIR_fret for the return, and use LIR_add, LIR_sub, etc. This is the assembly code generated for '2+3+4*2' using integer math:
mov eax,2                      
add eax,3 eax(2)
mov ebx,4 eax(add1)
mov ecx,2 eax(add1) ebx(4)
mul ebx,ecx eax(add1) ecx(2) ebx(4)
add eax,ebx eax(add1) ebx(mul1)


Categories:

Labels:

Friday, May 08, 2009

Theora encoder improvements

A lot of work has been going on improving the Theora video encoder. Given a better encoder we end up with better quality Theora videos without the need to change the decoder in existing players. The project name for the Theora encoder improvements is 'Thusnelda'.

Periodic updates have been released by Xiph showing the progress of the new encoder. The latest update makes interesting reading and show's graphs comparing it against other Theora builds, and against x264 (an H.264 encoder).

The update show's that some great improvements have been made, and in some cases is comparible to the results from the x264 encoder. Greg Maxwell gives some more information in this reddit comment about the results.

Remember that the intent of this is not to compare Theora vs x264, or to say one is better than the other in the general case. It's to show the improvement of Theora vs older Theora versions. The fact that it compares favourably against x264 in this one case, using one particular test of quality, is a good result though.

Categories: ,

Labels: