Stuff for Working With Structured Information Manager (SIM)
Find a File Anywhere- down in a directory tree (below you), such
as here with gunzip: find . -name 'gunzip' -print
Get a count of how many files/directories are in a directory-
ls | wc -l (ls -l | wc -l comes out one too high, because ls -l prints a "total" line first)
get a list of everything below me, and dump it in a file called "filename"
ls -lR > filename
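The find, count, and listing commands above can be tried safely in a scratch directory. A minimal sketch, assuming a POSIX shell; the directory layout and file names are made up for the demonstration:

```shell
# Build a throwaway tree so nothing real is touched (names are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/a/b"
touch "$demo/a/b/gunzip" "$demo/one.txt" "$demo/two.txt"
cd "$demo" || exit 1

# Find a file by name anywhere below the current directory.
found=$(find . -name 'gunzip' -print)

# Count the entries in the current directory.
count=$(ls | wc -l | tr -d ' ')

# Dump a recursive listing into a file kept outside the tree being listed.
listing=$(mktemp)
ls -lR > "$listing"

echo "found: $found"
echo "entries here: $count"
```

Writing the listing to a file outside the tree keeps the listing file from showing up in its own output.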
- Run a job in the background from the beginning. Such as with the
Java string below, type all that in, then leave a space at the end, and an &
After setting path, running XT: java (if memory is a problem, add heap options like -ms60M
-mx400M depending on server) com.jclark.xsl.sax.Driver file.xml script.xsl
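A minimal sketch of starting a job in the background and collecting its result later, assuming a POSIX shell. The sleep command is only a stand-in for a long job such as the java line above, and the variable names are hypothetical:

```shell
# Stand-in for a long-running job; the real command would be the java line.
out=$(mktemp)
( sleep 1; echo finished > "$out" ) &

# $! holds the process ID of the most recent background job.
job_pid=$!
echo "job $job_pid is running in the background"

# The shell is free for other work here; wait blocks until the job ends
# and hands back its exit status.
wait "$job_pid"
status=$?
echo "job exited with status $status"
```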
ftp in local directory
: After starting ftp, lcd to c:\whatever to set the
local directory, type bin for binary mode, and then do get/put.
ftp multiple files
: Set the prompt to non-interactive by typing "prompt" and return,
then use "mget *.*"
Unix File compression
tar cf archive files (e.g., tar cf bigstuff.tar all*.*)
Check what you have in its contents with tar tvf bigstuff.tar
then to compress it, do "compress bigstuff.tar" and you'll get bigstuff.tar.Z
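The archive, check, and compress steps can be sketched end to end as follows, assuming a POSIX shell. The sample files are made up, and because compress is not installed on every modern system, the sketch falls back to gzip (an assumption, not part of the original notes):

```shell
# Work in a scratch directory with two small sample files (hypothetical names).
work=$(mktemp -d)
cd "$work" || exit 1
echo one > all1.txt
echo two > all2.txt

# Bundle everything matching all*.* into one archive.
tar cf bigstuff.tar all*.*

# List the contents to check what went in, and count the entries.
tar tvf bigstuff.tar
entries=$(tar tf bigstuff.tar | wc -l | tr -d ' ')

# Compress the archive; fall back to gzip where compress is missing.
if command -v compress >/dev/null 2>&1; then
    compress bigstuff.tar      # leaves bigstuff.tar.Z
    packed=bigstuff.tar.Z
else
    gzip bigstuff.tar          # leaves bigstuff.tar.gz
    packed=bigstuff.tar.gz
fi
echo "archive with $entries files packed as $packed"
```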
recursively delete below you
: rm -rf name (deletes name and everything under it without asking, so check the path first)
make a symbolic link to some other directory, called "simservers":
ln -s /opt/sim/servers/ simservers
concatenate some stuff into a file: cat file.txt file3.txt > newfile.text
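A minimal sketch of the symbolic link and the concatenation, assuming a POSIX shell. The scratch directory stands in for /opt/sim/servers/, and the file contents are invented:

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# Stand-in for the real target directory, with one file inside it.
mkdir servers
touch servers/marker

# Link name comes last: "simservers" now points at the servers directory.
ln -s "$work/servers" simservers
ls simservers

# Concatenate two files into a third; the > sends the output to the new file.
printf 'first\n' > file.txt
printf 'second\n' > file3.txt
cat file.txt file3.txt > newfile.text
lines=$(wc -l < newfile.text | tr -d ' ')
echo "newfile.text has $lines lines"
```

Without the > redirect, cat would treat newfile.text as a third input file rather than as the destination.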
How much space is free? df -k
Make a directory? mkdir name
Find out what processes are running?
ps -elf
- Find out how many resources one process is taking
ps -elf | grep process
Move or Rename a file?
mv original new
How much space does this file/directory take? - du -sk name
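For use in a script, the df -k and du -sk numbers can be captured in variables. A small sketch, assuming a POSIX shell; the -P flag on df is an addition here that keeps each filesystem on one line so the column can be picked out reliably:

```shell
# Scratch directory with one 64 KB file (hypothetical data).
work=$(mktemp -d)
dd if=/dev/zero of="$work/blob" bs=1024 count=64 2>/dev/null

# Kilobytes used by the directory and everything in it.
used_kb=$(du -sk "$work" | awk '{print $1}')

# Kilobytes still free on the filesystem holding that directory
# (fourth column of the second line of df output).
free_kb=$(df -kP "$work" | awk 'NR == 2 {print $4}')

echo "used: ${used_kb} KB, free: ${free_kb} KB"
```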
Get rid of all stuff in folders below where you are
Change permissions?
chmod 777 name sets read, write, and execute for everyone, and chmod -R 777 name does the same to everything
in the tree below it
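A minimal sketch of what the permission change looks like, assuming a POSIX shell and a made-up scratch tree. The 755 step is an addition for contrast, since 777 lets any user write to the files:

```shell
work=$(mktemp -d)
mkdir -p "$work/tree/sub"
touch "$work/tree/sub/script.sh"

# Open up the whole tree: read, write, execute for owner, group, and others.
chmod -R 777 "$work/tree"
open_mode=$(ls -l "$work/tree/sub/script.sh" | cut -c1-10)

# A tighter setting: full access for the owner, read and execute for the rest.
chmod -R 755 "$work/tree"
tight_mode=$(ls -l "$work/tree/sub/script.sh" | cut -c1-10)

echo "after 777: $open_mode"
echo "after 755: $tight_mode"
```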
Find out the environment?
Change the current path for a session?
alias thing cd /frequently/used/path
Reload after a change to .cshrc?
source .cshrc
make an automated alias to another directory by just typing its name-
change your .cshrc: alias what_I_type 'cd /directory/of/choice'
protect yourself from "rm" mistakes: add alias rm 'rm -i' to your
.cshrc; rm will then prompt to ask if you are sure.
- set dos-like keys in Unix stty erase [then hit the "delete" key
to set it, or bksp]
vi Quickies (vim too)
move about esc, arrows --also--top/bottom of file esc,
shift-g to end of file,
esc, 1 shift-g to top of file
delete what character the cursor is on- esc, x
- escape a character such as when replacing & with &amp;, use
this to do so -- \& to escape it -- use the \ after the / coming before the
replacement, such as: esc, : %s/&/\&amp;/g
- delete to end of line use esc, shift-d (esc, :.,$ d deletes from the current line to the end of the file instead)
- GO To line # just type the number, followed by a shift-g
new line and start typing on it right away, from anywhere, without inserting
hard return, start new below line you're on without splitting the one you're
on - esc, o then start typing
copy a line - esc, shift-y,p
cut a line - esc, dd
- remove ^M character esc, :1,$ s/control-v, ret//, then return.
Control-v and return makes the ^M character.
replace a character while in escape - esc, arrows to character,
r, then whatever you want it to be
search- esc, / then type the word sought
search and replace in a big file - esc, : %s/find/replace/g
the % says every line, the s says to substitute, the g says repeat for every match on a line, a space
may come after the :, and no spaces after the %, the s, or the /
- Replace with a hard-return This will find an xml tag, and
insert a hard return before it (switch to "a" from "i" for putting it
in afterward)
- esc :map @ /^Mi^[#
-- the ^M comes from control-v-rtn, and the ^[ comes from cntrl-v-esc --
what you're doing is making a macro, so you have to have it repeat, by next doing esc :map #@
test it first with esc :map # jk (tests it on just one line)
save current file - esc, : w
save current file with another name - esc, : w filename.txt
save file with current name and EXIT - esc, shift zz
open another file - esc, : e filename
open another file without saving current one - esc, :edit! filename
quit- esc : q
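The substitute commands in this section also work outside the editor, since sed uses the same s/find/replace/g form. A small sketch, assuming a POSIX shell; the sample strings are invented, and tr is swapped in for the ^M removal because it is simpler on the command line:

```shell
# s/find/replace/g behaves in sed as it does after the : in vi.
replaced=$(printf 'find me, find you\n' | sed 's/find/replace/g')

# A bare & in the replacement means "the matched text",
# so a literal ampersand has to be written as \&.
escaped=$(printf 'AT&T\n' | sed 's/&/\&amp;/g')

# Strip the carriage returns that vi displays as ^M.
cleaned=$(printf 'dos line\r\n' | tr -d '\r')

echo "$replaced"
echo "$escaped"
echo "$cleaned"
```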
ACE and SIM Operating Stuff
Buildwiz (7709 port) for XML-
copy /opt/sim/r3.0/dtd/xml8.soc into your buildwizard tmp directory and
call it "my_xml_database.soc"
run the buildwizard as usual (assuming your dtd and source are in the buildwiz
tmp directory, of course)
and then open your resulting .ddl file (called my_xml_database.ddl) in
vi or emacs and get the line that says: "FragmentSGML" SGML MIME "text/sgml"
PHYSICAL and make it read: "FragmentSGML" SGML OPTIONS "Catalog=my_xml_database.soc"
MIME "text/sgml" HIGHLIGHTTEXT "<?SIM HI=\"ON\"?>" "<?SIM HI=\"OFF\"?>"
this just says, yes, it's sgml, but use this other catalog file, putting
SP into XML mode
alternately (untested), change it to: "FragmentSGML" XML MIME "text/xml"
Then just run the usual simddl cms my_xml_database.ddl from the
tmp directory
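The hand edit of the .ddl file could also be scripted. The sketch below is an assumption-laden illustration, not a tested SIM procedure: it works on a one-line stand-in for my_xml_database.ddl, inserts only the OPTIONS clause, and leaves the HIGHLIGHTTEXT clause and the rest of the line for editing by hand:

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# Stand-in for the generated file; a real my_xml_database.ddl has many more lines.
cat > my_xml_database.ddl <<'EOF'
"FragmentSGML" SGML MIME "text/sgml" PHYSICAL
EOF

# Keep a backup, then slip the OPTIONS clause in after the SGML keyword.
cp my_xml_database.ddl my_xml_database.ddl.orig
sed 's/^"FragmentSGML" SGML MIME/"FragmentSGML" SGML OPTIONS "Catalog=my_xml_database.soc" MIME/' \
    my_xml_database.ddl.orig > my_xml_database.ddl

# Show the edited line for a visual check before running simddl.
cat my_xml_database.ddl
```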
MARC & Z39.50
Z39.50 - supports version 2 of Z39.50 well enough, and by default
version 3 defaults and the OID (Object Identifier) options for each
Bib-1 set
allows RPN Query Type 1 (Reverse Polish Notation), and Query Type 2 (CCL,
cf. ddl files)
SimCLI learns via Explain Facility - you can use the line DDL CREATE CCLINFO
for internal attention to CCL equivalencies or edits you are making
abstract record structure is manipulated for multiple tag sets via DDL
CREATE SCHEMA Author BIND "bib1-" "author" [newline] BIND "GILS"
Full or Brief SUTRS returns, default for GRS-1
Support for all Z39.50 facilities, even a well-fleshed Explain and Extend
Fuzzy Searching - "title=@fuzzy(cut,90)" -- this might give "cat"; where
"90" is the accuracy in percent and "cut" is the word parameter
stemming - "dog~" gives singular and plural (cf. Z39.50 truncation); all
this is done with CCL syntax (ISO 8777, Common Command Language)
pattern matching - wom#n uses "#" to match any single character, and ?
matches/allows more than one wildcard character
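As a rough analogy in regular-expression terms (not how SIM itself evaluates patterns), "#" behaves like "." and "?" like ".*". The sketch below assumes a POSIX shell; ccl_to_regex is a hypothetical helper, and reading "?" as "any run of characters" is an assumption:

```shell
# Hypothetical helper: turn a CCL-style pattern into a grep regular expression.
# "#" becomes "." (exactly one character), "?" becomes ".*" (any run of characters).
ccl_to_regex() {
    printf '%s\n' "$1" | sed -e 's/#/./g' -e 's/?/.*/g'
}

# A tiny made-up vocabulary to search.
words=$(printf 'woman\nwomen\nwombat\nwon\n')

# grep -x requires the whole word to match the pattern.
single=$(printf '%s\n' "$words" | grep -x "$(ccl_to_regex 'wom#n')" | tr '\n' ' ')
multi=$(printf '%s\n' "$words" | grep -x "$(ccl_to_regex 'wom?')" | tr '\n' ' ')

echo "wom#n matches: $single"
echo "wom?  matches: $multi"
```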
Field searching - using CCL: TITLE=Smith (cf. "TITLE LIKE '%Smith%' " in
SQL) -- but also AUTHOR,SUBJECT=Smith, searching two fields at once (not possible
in a single SQL comparison).
Index Implementation- queries run only against indexed fields; records are not
examined in queries, ensuring performance - indexes are a B-tree and a
postings file, the B-tree has multiple nodes,
SIM indexes the vocabulary and #'s, this forms the B-tree, then maps where
they are, forming the postings; for scanning per field (cf. Z39.50 Scan)
Databases can have multiple indexes with different levels and granularity,
thus allowing concurrent markup and/or architectural forms-type solutions.
It helps to have the line: set path = (/opt/sim/r3.0/bin $path) in the
.cshrc of the user/admin,
Also in .cshrc add a setenv MANPATH /opt/sim/r3.0/man:$MANPATH
a symbolic link to ace in the buildwiz helps, if put in /opt/sim/servers
for version 3.0.3: ln -s /opt/sim/r3.0/install/buildwiz/ace ace
SERVER BITS and Parts:
simwebs is the web user interface;
simcms is the repository and retrieval (the core of it all, you
can have multiple machines for your simcms),
simdirs is the directory server telling other servers where everything is
simsls is the security and logging server
simboots- bootserver starts first, and gets others going-- on UNIX
SIM Goodies, add-ons, Unicode,
RDF Engine/webCrawler/Putator- SIM does a web-crawl, looking
in HTML for meta data, "title" tags, "h1" tags, etc. This is pumped
into a "putator" with a logic applicator (you can tailor this), to make
the RDF tag set, via putative metadata (metadata generated implicatively),
goes into a classifier (per topics, etc.) and then available through the
SIM Publisher- makes a standalone database instance on CD, fully
SDI: Selective Dissemination of Information- included "out of the
box" - notify of updates/changes in query history
All Java strings are made of 16-bit (Unicode) characters when strings are read