Stuff for Working With Structured
Information Manager (SIM)
UNIX Bits
-
Find a File Anywhere- down in a directory tree (below you), such
as here with gunzip: find . -name 'gunzip' -print
-
Get a count of how many files/directories are in a directory-
ls | wc -l (ls -l | wc -l comes out one too high, since it also counts the "total" line)
-
get a list of everything below me, and dump it in a file called "filename"
-
ls -lR > filename
- Run a job in the background from the beginning: such as with the
Java string below, type all that in, then leave a space at the end, and an
&
-
After setting path, running XT: java (if memory, then add like -ms60M
-mx400M depending on server) com.jclark.xsl.sax.Driver file.xml script.xsl
out.xml
-
ftp in local directory
: After running ftp domain.net, use lcd c:\whatever to set the
local directory, type bin for binary mode, and then do get/put.
-
ftp multiple files
: Set the prompt to non-interactive by typing "prompt" and return,
then use "mget *.*"
-
Unix File compression
-
tar cf archive.name.tar files (e.g., tar cf bigstuff.tar all*.*)
-
Check what you have in its contents with tar tvf bigstuff.tar
-
then to compress it, do "compress bigstuff.tar" and you'll get
"bigstuff.tar.Z"
-
recursively delete below you
: rm -rf name (removes name and everything under it without asking; there is no undo)
-
make a symbolic link to some other directory, called "simservers":
ln -s /opt/sim/servers/ simservers
-
concatenate some stuff into a file: cat file.txt file3.txt > newfile.text
-
How much space is free? df -k
-
Make a directory? mkdir name
-
Find out what processes are running?
ps -elf
- Find out how many resources one process is taking:
ps -elf | grep process
-
Move or Rename a file?
mv original new
-
How much space does this file/directory take? - du -sk name
-
Get rid of all stuff in folders below where you are
-
Change permissions?
chmod 777 name sets read, write, and execute for everyone, and chmod -R 777 name
does the same to everything in the tree below it
-
Find out the environment?
env
-
Change the current path for a session?
alias thing cd /frequently/used/path
-
Reload after a change to .cshrc?
source .cshrc
-
make an automated alias to another directory by just typing its name-
change your .cshrc: alias what_I_type 'cd /directory/of/choice'
-
protect yourself from "rm" mistakes: add alias rm 'rm -i' to your
.cshrc; rm will then prompt to ask if you are sure.
- set dos-like keys in Unix stty erase [then hit the "delete" key
to set it, or bksp]
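A scratch-directory run-through of several of the commands above (find, ls | wc -l, cat, tar, ln -s, du, rm -rf). It is only a sketch: the file and directory names (alpha.txt, beta.txt, sub, bigstuff.tar and so on) are made up for illustration, and everything happens in a throwaway directory from mktemp so nothing real gets touched.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is touched.
start=$PWD
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Made-up sample files and a subdirectory.
mkdir sub
echo "one" > alpha.txt
echo "two" > beta.txt
echo "three" > sub/gamma.txt

# Find a file anywhere below the current directory.
found=$(find . -name 'gamma.txt' -print)

# Count the entries in this directory (plain ls, so no "total" line).
count=$(ls | wc -l | tr -d ' ')

# Concatenate two files into a third, then count its lines.
cat alpha.txt beta.txt > both.txt
both_lines=$(wc -l < both.txt | tr -d ' ')

# Bundle files into a tar archive, then list what is inside.
tar cf bigstuff.tar alpha.txt beta.txt
listing=$(tar tf bigstuff.tar)

# Symbolic link to a directory, then read a file through the link.
ln -s "$scratch/sub" sublink
via_link=$(cat sublink/gamma.txt)

# Disk usage of the subdirectory, in kilobytes.
du -sk sub

echo "found=$found count=$count via_link=$via_link"

# Step back out and recursively delete the scratch tree.
cd "$start" || exit 1
rm -rf "$scratch"
```

Note that rm -rf is only ever pointed at the mktemp directory here; that habit of naming the target explicitly is the safest way to use it.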
vi Quickies (vim too)
-
move about esc, arrows --also--top/bottom of file esc,
shift-g to end of file,
esc, 1 shift-g to top of file
-
delete the character the cursor is on- esc, x
- escape a character: when replacing & with &amp;, the & in the
replacement stands for the matched text, so use \& to escape it -- the \ goes
after the / coming before the replacement, such as: esc, :%s/&/\&amp;/g
- delete to end of line use esc, shift-d (same as d$); to delete from the
current line to the end of the file use esc, :.,$ d
- GO To line # just type the number, followed by a shift-g
-
new line and start typing on it right away, from anywhere, without inserting
hard return, start new below line you're on without splitting the one you're
on - esc, o then start typing
-
copy a line - esc, shift-y,p
-
cut a line - esc, dd
- remove ^M character esc, :1,$ s/control-v, ret//, then return.
Control-v and return makes the ^M character.
-
replace a character while in escape - esc, arrows to character,
r, then whatever you want it to be
-
search- esc, / then type the word sought
-
search and replace in a big file - esc, : %s/find/replace/g
the % says every line, the s says to substitute, the g says repeat if it occurs
more than once on one line; a space can come after the :, and no spaces go after
the %, the s, or the /
- Replace with a hard-return This will find an xml tag, and
insert a hard return before it (switch to "a" from "i" for putting it
in afterward)
- esc :map @ /^Mi^[#
-
-- the ^M comes from hitting control-v-rtn, and the ^[# comes from cntrl-v-esc --
-
what you're doing is making a macro, so you have to have it repeat, by next
doing esc :map #@
-
test it first with esc :map # jk (tests it on just one line)
-
save current file - esc, : w
-
save current file with another name - esc, : w filename.txt
-
save file with current name and EXIT - esc, shift zz
-
open another file- esc, :e filename
-
open another file without saving current one esc, :edit! filename (with no
filename, :edit! reloads the current file and throws away unsaved changes)
-
quit- esc : q
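For big files, the same kinds of substitution can be done from the shell without opening vi. This swaps in sed and tr for the vi commands, so treat it as a sketch of the idea and not a transcript of the vi steps; the sample text and the item tag name are made up. As in vi, a bare & in the sed replacement stands for the matched text, so a literal ampersand is written \&.

```shell
#!/bin/sh
# Made-up sample line containing a bare ampersand.
line='Smith & Jones'

# In the replacement, a bare & means "the matched text",
# so a literal ampersand has to be written as \&.
escaped=$(printf '%s\n' "$line" | sed 's/&/\&amp;/g')

# DOS line endings: tr -d '\r' drops the carriage return that vi shows as ^M.
dos_text=$(printf 'first\r\nsecond\r\n' | tr -d '\r')

# Insert a newline before every <item> tag (hypothetical tag name).
# Backslash followed by a real newline is the portable way to put
# a line break into a sed replacement.
split=$(printf '%s\n' '<list><item>a</item><item>b</item></list>' | sed 's/<item>/\
<item>/g')

echo "$escaped"
echo "$dos_text"
echo "$split"
```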
ACE and SIM Operating Stuff
XML
-
Buildwiz (7709 port) for XML-
-
copy /opt/sim/r3.0/dtd/xml8.soc into your buildwizard tmp directory and
call it "my_xml_database.soc"
-
run the buildwizard as usual (assuming your dtd and source are in the buildwiz
tmp directory, of course)
-
and then open your resulting .ddl file (called my_xml_database.ddl) in
vi or emacs and get the line that says: "FragmentSGML" SGML MIME "text/sgml"
PHYSICAL and make it read: "FragmentSGML" SGML OPTIONS "Catalog=my_xml_database.soc"
MIME "text/sgml" HIGHLIGHTTEXT "<?SIM HI=\"ON\"?>" "<?SIM HI=\"OFF\"?>"
PHYSICAL
-
this just says, yes, it's sgml, but use this other catalog file, putting
SP into XML mode
-
alternately (untested), change it to: "FragmentSGML" XML MIME "text/xml"
PHYSICAL
-
Then just run the usual simddl cms my_xml_database.ddl from the
tmp directory
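The hand edit of the .ddl file could also be scripted. The sketch below works on a one-line stand-in for my_xml_database.ddl in a scratch directory. It assumes the FragmentSGML declaration sits on a single line exactly as quoted above, which may not hold for real buildwizard output, so look at the result before running simddl on it.

```shell
#!/bin/sh
# Stand-in .ddl file in a scratch directory; a real one comes from the buildwizard.
work=$(mktemp -d)
ddl="$work/my_xml_database.ddl"
printf '%s\n' '"FragmentSGML" SGML MIME "text/sgml" PHYSICAL' > "$ddl"

# Rewrite the FragmentSGML line to name the XML catalog and add the
# highlight markers. Backslashes are doubled so sed emits them literally.
# Assumption: the whole declaration is on one line in the file.
sed 's|^"FragmentSGML" SGML MIME "text/sgml" PHYSICAL$|"FragmentSGML" SGML OPTIONS "Catalog=my_xml_database.soc" MIME "text/sgml" HIGHLIGHTTEXT "<?SIM HI=\\"ON\\"?>" "<?SIM HI=\\"OFF\\"?>" PHYSICAL|' "$ddl" > "$ddl.new"
mv "$ddl.new" "$ddl"

# Show the edited declaration for a visual check.
cat "$ddl"
```

Writing to a .new file and moving it into place keeps the original intact if sed fails partway.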
MARC & Z39.50
-
Z39.50 - supports version 2 of Z39.50 well enough, and by default
-
version 3 defaults and the OID (Object Identifier) options for each
Bib-1 set
-
allows RPN Query Type 1 (Reverse Polish Notation), and Query Type 2 (CCL,
cf. ddl files)
-
SimCLI learns via Explain Facility - you can use the line DDL CREATE CCLINFO
for internal attention to CCL equivalencies or edits you are making
-
abstract record structure is manipulated for multiple tag sets via DDL
CREATE SCHEMA Author BIND "bib1-" "author" [newline] BIND "GILS"
"origin"
-
Full or Brief SUTRS returns, default for GRS-1
-
Support for all Z39.50 facilities, even a well-fleshed Explain and Extend
Searching
-
Fuzzy Searching - "title=@fuzzy(cut,90)" -- this might give "cat"; where
"90" is the accuracy in percent and "cut" is the word
sought
-
stemming - "dog~" gives singular and plural (cf. Z39.50 truncation); all
this is done with CCL syntax (ISO 8777, Common Command Language)
-
pattern matching - wom#n uses "#" to match any single character, and ?
matches/allows more than one wildcard character
-
Field searching - using CCL: TITLE=Smith (cf. "TITLE LIKE '%Smith%' " in
SQL) -- but also AUTHOR,SUBJECT=Smith (not possible
in SQL).
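To get a feel for the "#" and "?" wildcards without a SIM server handy, the patterns can be mimicked with grep. This is purely an illustration and not how SIM evaluates CCL: ccl_to_regex is a made-up helper, the word list is invented, and reading "?" as "any run of characters" is an assumption about what the notes above mean.

```shell
#!/bin/sh
# Translate a CCL-style pattern into a grep regular expression:
#   #  -> .   (exactly one character)
#   ?  -> .*  (any run of characters; an assumption about SIM's "?")
ccl_to_regex() {
    printf '%s\n' "$1" | sed -e 's/#/./g' -e 's/?/.*/g'
}

# Made-up word list to search.
words='woman
women
wombat
catalog
catalogue'

# wom#n should pick up woman and women but not wombat.
single=$(printf '%s\n' "$words" | grep -x "$(ccl_to_regex 'wom#n')")

# catalog? should pick up both spellings.
multi=$(printf '%s\n' "$words" | grep -x "$(ccl_to_regex 'catalog?')")

echo "$single"
echo "$multi"
```

The -x flag makes grep match whole lines only, which is what keeps wombat out of the first result.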
Whatnot
-
Index Implementation- searching runs only against indexed fields; records are not
examined in queries, ensuring performance - indexes are a B-tree and a
postings file; the B-tree has multiple nodes
-
SIM indexes the vocabulary and #'s, this forms the B-tree, then maps where
they are, forming the postings; for scanning per field (cf. Z39.50 Scan
facility)
-
Databases can have multiple indexes with different levels and granularity,
thus allowing concurrent markup and/or architectural forms-type solutions.
-
It helps to have /opt/sim/r3.0/bin on the path in the .cshrc of the
user/admin, with the line: set path = ( /opt/sim/r3.0/bin $path )
-
Also in .cshrc add a setenv MANPATH /opt/sim/r3.0/man:$MANPATH
-
a symbolic link to ace in the buildwiz helps, if put in /opt/sim/servers
for version 3.0.3: ln -s /opt/sim/r3.0/install/buildwiz/ace ace
-
SERVER BITS and Parts:
-
simwebs is the web user interface;
-
simcms is the repository and retrieval (the core of it all, you
can have multiple machines for your simcms),
-
simdirs is the directory server telling other servers where everything
is;
-
simsls is the security and logging server
-
simboots- bootserver starts first, and gets others going-- on UNIX
only
SIM Goodies, add-ons, Unicode, features
-
RDF Engine/webCrawler/Putator- SIM does a web-crawl, looking
in HTML for meta data, "title" tags, "h1" tags, etc. This is pumped
into a "putator" with a logic applicator (you can tailor this) to make
the RDF tag set via putative metadata (metadata generated by implication).
The result goes into a classifier (per topics, etc.) and is then available
through the cms.
-
SIM Publisher- makes a standalone database instance on CD, fully
runnable.
-
SDI: Selective Dissemination of Information- included "out of the
box" - notify of updates/changes in query history
-
Unicode:
-
All Java strings are made of 16-bit (UTF-16) characters
-