Log of the #blacklight channel on chat.freenode.net

Using timezone: GMT-05:00
* bess joins00:20
* bess leaves00:53
* Guest93282 leaves03:28
<erikhatcher>asked over in #code4lib, but i'll ask here too:08:35
solrmarc question.... i'm using the GenericBlacklight configuration, and indexing the 5M+ open Talis MARC file, and it's giving me an error on subtitle_display not being multivalued.
question is, how to make subtitle_display = custom, removeTrailingPunct(245b) only return a single val
* BillDueber joins09:35
* tachyonwill_ joins10:11
<jrochkind>erikhatcher: I don't believe you can without writing a custom method (in either Java or BeanShell). I ran into that exact same thing too. You can't combine removeTrailingPunct and 'first' right now, in the standard logic. But you can write your own method. 10:18
(incidentally, I think it's semantically illegal for a marc21 record to have more than one 245b. But my corpus includes such as well. )10:19
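The "write your own method" idea above can be sketched in a few lines. This is only an illustration of the logic (clean each value, keep the first one), written in Python rather than the Java or BeanShell that SolrMarc expects. The function names and the punctuation set are assumptions, and this is not SolrMarc's own removeTrailingPunct.

```python
import re

# Hypothetical, simplified punctuation set: trailing whitespace plus the
# common ISBD separators. SolrMarc's real removeTrailingPunct differs.
_TRAILING_PUNCT = re.compile(r"[\s/:;,=]+$")


def remove_trailing_punct(value):
    """Strip trailing whitespace and separator punctuation from one value."""
    return _TRAILING_PUNCT.sub("", value)


def first_without_trailing_punct(values):
    """Return the first non-empty cleaned value, or None if there is none."""
    for value in values:
        cleaned = remove_trailing_punct(value)
        if cleaned:
            return cleaned
    return None


# Two 245$b values from one record, as in the corpus described above.
print(first_without_trailing_punct(["a field guide /", "second subtitle :"]))
```

Returning a single value (or nothing) is what keeps a non-multivalued field such as subtitle_display from receiving more than one value.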
<jvenner>Good morning10:25
Does anyone have an example that allows for more facets to be displayed?
using 2.410:26
<jrochkind>jvenner: not sure what you mean. I think I might? What do you mean by more facets?
Oh wait, you mean a link to 'more' that shows you all values?
<jvenner>right now when I pop open one of the facet labels, there are the top N facets
<jrochkind>Right, okay. 10:27
<jvenner>yes
<jrochkind>I am right now working on code to submit to Blacklight to make that built-in. But right now it's not. And not particularly easy to write.
<jvenner>Makes sense
The solr interface is very stylized it seems
<jrochkind>UVa has done it in their own custom code. I wasn't successful at getting them to share their code with me, but they shared the principles behind it, which I'm using to write code which I plan to submit back to Blacklight.
<jvenner>My other issue is I want to use the date range facet support in solr 1.4
<jrochkind>jvenner: I thought about the date range thing too. I _think_ that should be do-able. Not sure if you'll have to hack Blacklight, or can just set up Solr to do it and it'll Just Work?10:28
<jvenner>I have date fields that I index as tdate but I only get exact match facets
so clearly I am missing something
I remember trying in 2.3, to force date range facets in the solrconfig.xml but it didn't work10:29
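For reference, Solr 1.4 date faceting is requested with the facet.date family of parameters rather than facet.field. Below is a minimal sketch of building such a request by hand. The host, the field name pub_date, and the date math values are assumptions for illustration, not taken from jvenner's setup.

```python
from urllib.parse import urlencode


def date_facet_params(field, start, end, gap):
    """Build the query parameters for a Solr date facet on one field."""
    return [
        ("q", "*:*"),
        ("rows", "0"),  # only the facet counts are of interest here
        ("facet", "true"),
        ("facet.date", field),
        ("facet.date.start", start),
        ("facet.date.end", end),
        ("facet.date.gap", gap),
    ]


# Hypothetical field name and range: yearly buckets over roughly ten years.
params = date_facet_params(
    "pub_date", "NOW/YEAR-10YEARS", "NOW/YEAR+1YEAR", "+1YEAR"
)
# Hypothetical Solr location; urlencode escapes the leading "+" of the gap.
print("http://localhost:8983/solr/select?" + urlencode(params))
```

The leading "+" in the gap has to be URL-encoded (as %2B), otherwise it reaches Solr as a space. Sending a request like this straight to Solr shows whether the ranges come back before Blacklight is involved at all.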
<erikhatcher>jvenner: i don't suppose you'll be at the code4lib conf next week?
<jvenner>teehee
I don't even consider myself speaking ruby at present
java, c++/c, perl are my friends
in fact I am a big contributor on solr 1301 and a couple of other solr at scale patches10:30
to honk my own horn for a moment ;) since I feel so ruby useless
<erikhatcher>jvenner: you rock10:31
<jvenner>haha
thanks erikhatcher
<jrochkind>jvenner: at some point I might get around to figuring out the date range thing, and modifying BL as necessary to make it Just Work. But not sure when. :) 10:33
<jvenner>I have another odd ball blacklight issue, I have an index split into 9 shards, with a 10th instance that just distributes the queries to the shards
for some reason the initial facet count always comes up as if there was a query that restricts the results to a very small subset10:34
* jrochkind knows NOTHING about shards, sorry.
<jvenner>however regular searches produce the full space
my index is 1.3TB, I have to shard it
I look forward to the date facet code jrochkind
<erikhatcher>jvenner: you are seeing the sharded facet issue with raw queries to Solr? or only through Blacklight?10:35
<jvenner>I have to double check
if it is at the solr level then it is a solr problem...
<erikhatcher>right, that's what i was trying to peel away
<jvenner>just going to sort that out10:37
<jrochkind>jvenner: can't promise the date facet code, but I'm trying to get the 'more' facet code finished up as we speak. 10:38
<jvenner>do you know why sometimes firefox renders the xml and sometimes it doesn't? when doing raw solr queries
more facet is rockstar!
everyone asks for that10:39
and I can fake the date facet stuff with pseudo range fields
at index build time
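A minimal sketch of that workaround: compute a range label for each document while building the index and store it in an ordinary string field, which can then be faceted like any other field. The decade bucketing and the function name are assumptions chosen for illustration.

```python
from datetime import date


def decade_bucket(d):
    """Map a date to a decade label such as '1980-1989'."""
    start = d.year - d.year % 10
    return f"{start}-{start + 9}"


# The label would be indexed in a separate facet field next to the tdate field.
print(decade_bucket(date(1987, 6, 1)))
```

The trade-off is that the bucket boundaries are fixed at index time, so changing them means reindexing.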
<erikhatcher>jvenner: if you bake your facet config stuff into your request handler mappings then it's simply a display problem, right? seems tractable10:40
<jvenner>I thought I tried it, but at the time I did, not only was my ruby/blacklight skill level 0, my solr config skill was probably only 0.110:41
at this point I am probably 0.2 on the blacklight and 0.4 on the solr
so I have a better chance of not id10t'ing myself
<jrochkind>yeah, it seems like the date range facet stuff should be fairly simple, but wouldn't surprise me if some tweaking of BL code is necessary. 10:43
<jvenner>I think I had trouble working out how to pass or parse the request/results
erikhatcher, do you know how to ask solr to return the shard name, in the responses?
<erikhatcher>jvenner: not currently possible, as far as i know10:44
<jvenner>ahh
<jrochkind>jvenner: I mean, in theory, it would Just Work. Blacklight is already accepting and displaying what solr returns for facet values. First step would be just configuring it on the solr side, and seeing what happens, what Blacklight does with it.
<jvenner>We hacked it into the katta code but it would get eaten at the presentation layer
kk
<erikhatcher>jrochkind: the date range facets are in another area of the response though, not the same as field faceting10:45
but, just a parallel similar data structure
so should be fairly straightforward to hack in
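To illustrate the point about the response layout: in a Solr 1.4 response the date counts sit under facet_counts/facet_dates, next to facet_counts/facet_fields, and each field's entry mixes the bucket counts with metadata such as the gap. The sketch below reads them from an already parsed response. The sample data and field names are made up, and this is not Blacklight's code.

```python
def date_facet_counts(response, field):
    """Return sorted (bucket_start, count) pairs for one date facet field."""
    facet_dates = response.get("facet_counts", {}).get("facet_dates", {})
    buckets = facet_dates.get(field, {})
    # Bucket keys are timestamps; metadata keys such as "gap" and "end"
    # start with a letter, so they are skipped here.
    return sorted((k, v) for k, v in buckets.items() if k[:1].isdigit())


# Hypothetical parsed response, showing both facet sections side by side.
sample = {
    "facet_counts": {
        "facet_fields": {"format": ["Book", 12, "Map", 3]},
        "facet_dates": {
            "pub_date": {
                "2008-01-01T00:00:00Z": 5,
                "2009-01-01T00:00:00Z": 7,
                "gap": "+1YEAR",
                "end": "2010-01-01T00:00:00Z",
            }
        },
    }
}
print(date_facet_counts(sample, "pub_date"))
```

Code that only walks facet_fields would never see these counts, which matches the conclusion that a patch on the display side is needed.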
<jrochkind>aha, I see. so, okay, definitely BL patch would be needed. but ideally fairly straightforward. but yeah, there's a learning curve to ruby, rails, and then Blacklight specifically.
<erikhatcher>i'm so out of the loop with BL internals, and Ruby/Rails myself10:46
<jvenner>when I do the low level facet query, I get results across all of my shards
now to backtrack through the blacklight case
<erikhatcher>i'm looking forward to working with folks closely next week to get back into the swing of things and learn how to "git" all this stuff up and running locally once again
<jrochkind>erikhatcher: awesome.
<erikhatcher>jvenner: date faceting isn't distributed aware yet either, so there's that issue
<jvenner>oh interesting10:47
I have 4 datasets I index or hope to index, currently only 1 of them requires sharding
<jrochkind>jvenner: curious what about it requires sharding. its incredible hugeness? How huge?
<jvenner>the current hot one is < 2GB so I can fit it in ram
1.3TB10:48
I don't have a machine with enough ram to load it ;)
<jrochkind>ah, you want to fit everything in RAM. You're already a step ahead of me, heh.
<jvenner>I have a lot of SSD to play with but not enough to hold it all on a single machine either
it looks like it needs 14G of heap to just load the index
1.3TB is pretty small in the grand scheme of things
<jrochkind>hmm, that doesn't make sense, I don't think. I mean, for performance you might want to have everything in RAM, but you don't _need_ to for solr -- and I don't think you should need a lot of heap to index it. Not that I'm a solr expert or anything. 10:49
<jvenner>I can't load my index unless I give the JVM 14g of heap10:51
my ops people won't put more than 16 in a box, so that leaves me a bit short ;)
<erikhatcher>jvenner: how many docs we talking?
<jrochkind>something doesn't seem right there. needing 14g of heap to load the index.
<jvenner>just over .5 billion10:52
<erikhatcher>you must be doing some serious sorting and/or faceting
<jvenner>just to load?
<erikhatcher>.5? as in 500M?
<jvenner>yes
<erikhatcher>i'm not sure what you mean by load... you mean you need that much ram just to index?
<jvenner>I build my index in hadoop
<erikhatcher>what data source are you indexing from? indexing shouldn't need much RAM
<jvenner>so my index builds are taken care of10:53
java -d64 -Xmx14g -jar start.jar
to load my already built index
<erikhatcher>500M on a single Solr index?
obviously sharding that should help a lot10:54
<jvenner>yes
so, I think I have the first clue in my facet problem
when I use qt=standard, I get facet across the shards10:55
when I use qt=search, I get a facet for a 'secret' query
<erikhatcher>secret query?
jvenner: check the e-mail list about sharding and the qt parameter. seems like there is something mentioned the other day about that10:56
jvenner: maybe this thread has the key? http://www.lucidimagination.com/search/document/8b6eec5eab3dae47/create_requesthandler_with_default_shard_parameter_for_different_query_parser#a1ae64de1f59aa0210:57
<jvenner>ty,looking
humm, it does seem to be my problem, how to work around it10:58
since the qt=search is pretty much built into bl10:59
and qt=.... for other queries
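One way to isolate this outside Blacklight is to send the same distributed request to Solr twice, changing only qt, and compare the facet counts. The sketch below builds the two URLs. The host names and the facet field are placeholders, and it assumes the shards are passed explicitly with the shards parameter rather than as handler defaults.

```python
from urllib.parse import urlencode


def sharded_query_url(base_url, shards, qt, facet_field, q="*:*"):
    """Build a distributed Solr query URL for one request handler (qt)."""
    params = [
        ("q", q),
        ("qt", qt),
        ("shards", ",".join(shards)),  # comma-separated host:port/path list
        ("rows", "0"),
        ("facet", "true"),
        ("facet.field", facet_field),
    ]
    return base_url + "?" + urlencode(params)


# Hypothetical shard hosts and facet field.
shards = ["solr1:8983/solr", "solr2:8983/solr"]
for qt in ("standard", "search"):
    print(sharded_query_url("http://solr0:8983/solr/select", shards, qt, "format"))
```

If the two responses differ, the handler definitions in solrconfig.xml (their defaults and appended parameters) are the first place to look.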
<erikhatcher>good question11:00
but on that note, i've got to run. crossfit gym time!
* jamieorc joins
* erikhatcher leaves11:01
<jvenner>ty11:02
* jkeck joins12:13
* bess1 joins12:23
* jkeck leaves12:27
* bess1 leaves12:28
* jkeck joins12:47
* ndushay joins13:12
<jvenner>is it safe/reasonable to use a version of gems other than 1.3.1 with blacklight 2.4? 1.3.1 seems to segv with ruby 1.8.7 on my solaris boxes13:21
* ndushay leaves13:24
* erikhatcher joins13:30
<cbeer_>jvenner: i've been using 1.3.5, but i don't know if it is safe or reasonable 13:32
<jvenner>I think the problem may be in ruby itself
related to my use of an http_proxy
I get segv's from gem when I try to update or install13:33
<jkeck>jvenner: actually rubygems 1.3.5 is required I think.13:39
jvenner: I'm using rubygems 1.3.5 and ruby 1.8.6
<jvenner>oh, I was reading the docs and it stated 1.3.113:40
http://github.com/projectblacklight/blacklight/blob/master/doc/PRE-REQUISITES.rdoc
<jkeck>Ah, ok. Well to answer your question 1.3.5 is perfectly safe.13:41
We're running 1.3.5 in production @ Stanford.13:42
<jvenner>kk13:44
installing hopefully it will work around my segv
probably due to using an http proxy
<jkeck>Hmm, I haven't tried that before, so I unfortunately can't be much help.
<jvenner>how about rails versions?13:45
ruby problem13:48
happens even if I don't set the proxy up
<jkeck>Well that depends on what the environment.rb of your version of Blacklight says.
<jvenner> /usr/local/lib/ruby/1.8/net/http.rb:560: [BUG] Segmentation fault13:49
<jkeck>trunk is rails 2.3.5
<jvenner>ruby 1.8.7 (2008-08-11 patchlevel 72) [i386-solaris2.10]
<jkeck>When does that happen? When you ./script/server ?13:50
* BillDueber leaves13:51
<jvenner>I can't get rails installed13:52
that happens when I run gem install -v 2.3.4 rails
It is in the socket code
libc.so.1`_lwp_kill+7(1, 6)
libc.so.1`raise+0x1f(6)
libc.so.1`abort+0xcd(7273752f, 636f6c2f, 6c2f6c61, 722f6269, 2f796275, 2f382e31)
0x80d5cb7(80ee159, 0, 8038b20, fee72d8f, 8038ad8, fee72d8f)
0x80b2f2a(b, 0, 8038b74)
libc.so.1`__sighndlr+0xf(b, 0, 8038b74, 80b2f10)
libc.so.1`call_user_handler+0x22b(b, 0, 8038b74)
libc.so.1`sigacthandler+0x65(b, 0, 8038b74)
0x2fb6(8038e50, 8038e30, 8039260, 8038e24, 0, 8039298)
0xfeb046fa(8038e50, 8038e30, 8039260, 8038e24)
socket.so`sock_addrinfo+0x11b(2, 0, 1, feffb28c)
s
<jkeck>Uhhhhh, ouch. Have you tried to sudo?13:53
* bess1 joins13:59
* ndushay joins14:24
<jvenner>was running as root actually14:28
* jkeck leaves14:31
* bess1 leaves14:34
* jkeck joins14:45
* BillDueber joins14:49
* bess1 joins
* bess1 leaves14:51
* BillDueber leaves15:10
* BillDueber joins15:34
* rduplain joins15:57
* bess joins16:08
* bess leaves16:29
* tachyonwill_ leaves17:03
* BillDueber leaves17:06
* jamieorc leaves17:10
* bess joins17:49
* bess leaves17:56
* jkeck leaves18:03
* jkeck joins18:04
* erikhatcher leaves18:21
* rduplain leaves18:45
* rsinger leaves18:59
* bess joins19:11
* jkeck leaves19:15
* jkeck joins19:18
* erikhatcher joins19:19
* g8tor joins19:40
* g8tor leaves19:53
* bess leaves20:17
* rduplain joins20:27
* ndushay leaves20:57
* cbeer leaves21:01
* jkeck leaves21:03
* rsinger joins21:37
* jaron_ joins21:46
* jaron_ leaves21:52
* ndushay joins22:01
* rsinger leaves22:07
* rduplain leaves22:40

Generated by Sualtam