The Guide to the FIP Data Formatting Module

Relevant Program documentation.

HOW TO FORMAT !

Time for Some Golden Rules:

  • 1: Study the INPUT file in detail.
    • Try to get several input files as you need to know EXACTLY what variations there can be in the input.
    • Try to get a specification for the input if it exists, as that can save hours trying to guess.

  • 2: Study the OUTPUT file in detail. Again a spec is useful.
  • 3: Talk to people about why they did it that way or why they want it that way.
    • If you are trying to copy an existing format, check with the Users what they REALLY need. Often things were done in a strange way because of system shortcomings or users' personalities. Get a different person in charge and they may have different emphases. You may find that there is a subsequent system, written on the system you are delivering to, that is making a good many changes that you could include in your file processing. Similarly you may find that a lot of manual work is done on the file which you could circumvent.

  • 4: Remember to eat Lunch.
    • No system is worth missing food and drink.

Now the Steps :

  • 1. Sort out Copy Flow for both testing and live running.
    • Do you need an 'xchg' before formatting ? It is always a good idea to rationalise the input file as far as possible - running a preliminary xchg may help.
    • Do you need an 'xchg' after formatting ? Again there may be some tidying up needed once the file is processed that is better suited to ipxchg than to ipformat.
    • What about special processing like needing to 'sort' output ? Do you need to sort the copy inbound ? Do you need to resort it outbound ? Do you need to select specific records ?
    • Does it need multiple dataformats ? Occasionally you get a file that can be sorted one way, then formatted to, say, pick up items between consecutive records, then sorted another way and reformatted for output.

  • 2. Map out what you want to format.
    • It is usually easier to make notes BEFORE starting to test. Just a few notes are all that is required.
  • 3. Setup and start testing the main Format.

Often it is easier to copy an existing Text file from form/text, use it as a template and rework it to meet the new requirements. There are two ways of testing copy 'offline'. At the Unix command line you can use the 'form' program to set yourself up a testing environment.

Alternatively you can use the form module in W4.

  • 4. Work out the other parameters in the Copy Flow. See the relevant section in this guide for an example of Copy Flow.
  • 5. Release
    • Dual running over a number of days is always advised.
  • 6. Check

It is always worth going back to the users and seeing how things can be improved for them. There may be things you have missed, or just things that they now realise you can do, that they have never thought to ask for before.

Pointers

  • If you need to sort the output file : use the 'job' parameter in the text/PROCESS file (see sections Copy Flow and pformd in the Program Section).

  • if you need to select only certain records from the input file matched against a standing list of keys, use ipfsel (see description in the Program Section).

  • if you need to track whether several input files have arrived over a period of time, flag if they are missing and process if they are there (for instance, a number of postscript files for an OPI), use ipformch (see description in the Program Section).

  • if the input file(s) are very dirty and inconsistent, clean them up beforehand with ipfprep (see description in the Program Section).

USING 'FORM' TO TEST AND TUNE

There are two interfaces to test & tune your dataformats. 'form' is a fast, basic, command-line driven interface that runs under Unix. There is also a web based interface available through W4.

FORM, the user interface for the Data Formatting module, can be used to test and tune a process 'internally' - without actually sending the output file anywhere nor deleting the input. It can also be used to run an XCHG before and/or after the format.

Several tests can be run and their settings are held in a series of Settings files in tables/form/test.

Note at the moment 'form' will NOT scan the PROCESS file and run any pformd jobs (qv) as it is used purely for getting the parameter file in form/text 100% correct.

Firstly you choose an existing Settings file - if there is one.

Then you need to get the input file into the queue in spool you are using for testing.

If it is the first time you will need to 'mkdir' your test area. Generally one has been setup called 'spool/test'.

If the input file is on diskette or on another system, just copy it over. If the file came from a wire service or dialup and is in an FIP Archive log, you can add a destination in the sys/USERS file and resend to it. Generally, if FIippies have been onsite, there is a default one already setup called 'test' which goes through 'ipedsys' to cleanup the filename BUT does not go through any 'xchg' :

	sys/USERS : test=	DP:com2  DQ:2edsys  EQ:test SC:no DC:no DF:testing
This uses edsys/TESTING to place the file in spool/test.

To get into the test bit of 'form', type 'x' at the form prompt. This will reveal firstly which Settings files are available which you can choose from.

The current settings are displayed before the list of options available.

All options are a single character (case-insensitive) followed by 'enter'.

When changing settings, a '*' can be used to list all the files in the relevant directory. Case is only sensitive for filenames.

root @ com_server2/ > form
form-com_server2:x
** Test FORM - Looking for Settings files
-- List of Files in the Queue : /fip/tables/form/test
total 2
-rw-rw-rw-  1 root		93 Mar 17 10:53 RESRAC
-rw-rw-rw-  1 root		94 Mar 17 19:52 RESTEST
** Hit Return to continue .. 

** Test FORM - Choose a Settings files (or return to ignore) :restest
** Test FORM - Existing settings are :
	Working Queue		: /fip/spool/test
	Input file		: rem737
	TEXT Param file		: RESTEST
	XCHG before		: PA2FORM
	XCHG after		 : PA2ATEX
	DU for Live Tests	: sunres


** Options are :
	Change Settings		: C
	List the Working Queue	: L
	Look at the Input File	: I
	 ..  .. the Output File	: O
	 ..  .. the BreakOut	: T
	Edit TEXT file		: E
	 ..  Xchg before file	: B
	 ..  Xchg after file	: A
	 ..  PROCESS for jobs	: P
	Help			: H
	Run a Test		: R
	Send Live Test		: S
	Quit			: Q : 

In (slightly) more detail, the options are :

	Change Settings               - change any of the 4 inputs
	List the Working Queue        - 'ls -l' of the working queue
	Look at the Input file        - More, Dump, Edit, Tail the input which MUST BE in the working queue
	Look at the Output file       - More, Dump, Edit, Tail the output which is hidden in the temp queue
	Look at the BreakOut file     - More, Dump, Edit, Tail the breakout file, which is created by IPFORMAT and shows how the input is split into fields and records
	Edit the TEXT parameter file  - IPVI the parameter file in form/text
	Edit the Xchg before file     - IPVI the (optional) xchg file to be used BEFORE the format
	Edit the Xchg after file      - IPVI the (optional) xchg file to be used AFTER the format
	Help                          - More this file !
	Run a Test                    - run IPFORMAT and then look at the output file
	Send Live Test                - copy the test file and send to the DU (destination) specified
	Quit                          - No documentation available on this

In more more detail .....

-- Change Settings :
** Change Settings - Existing settings are :
	Working Queue		: /fip/spool/test
	Input file			: rem727
	TEXT Param file	 : RESTEST
	XCHG before		  : PA2FORM
	XCHG after			: PA2ATEX
	DU for Live Tests		 : sunres

  (At each prompt, Use '*' for list of files)
	Change Working Queue	 : W
	Change Input file		 : I
	Change TEXT Param file  : T
	Change XCHG before		: B
	Change XCHG after		 : A
	Change DU for Live Tests: D
	List the Working Queue  : L
	Quit			: Q or enter : 
Note that a '*' 'enter' at any of the Change file lines will list the relevant queue before reprompting.

Type '-none-' (or in fact just '-') to set an optional field to '-none-'.

Note that any reference to 'jobs' should be ignored in this version - I do.

Looking at any of the files :

** Look at file : rem727 : Options are :
	More	 : M or enter	- This does a 'cat -v' before to show ControlChr
	Dump	 : D		- Essentially an 'od -ab'
	Edit	 : E		- Using 'vi'
	Tail	 : T		- Last 20 lines of a 'cat -v'
	Quit	 : Q : 
Running a Test :

- A simple format with no xchgs -

** Running please wait ... 

	/fip/bin/ipformat -i /fip/spool/hoswrk/rem727 -p ATEXHORSES -o /fip/xFORM.XX.DEFAULT.XX -s -D -l -xo 

** Ok Done	 Hit Return to continue ..

This will now go directly into the Looking at Output file menu.

- A more complicated run with an xchg before and after -

** Running please wait ... 
	Saving the input in /fip/x/FORM_rem727
	ipxchg -1 /fip/x/FORM_rem727 -D PA2FORM -o /fip/x -F 
	** Ok : Xchg Before finished 
	/fip/bin/ipformat -i /fip/x/FORM_rem727 -p RESTEST -o /fip/x/FORM.XX.DEFAULT.XX -s -D -l -xo 
	** Ok : Format finished 
	ipxchg -1 /fip/x/FORM.XX.DEFAULT.XX -D PA2ATEX -o /fip/x -F 
	** Ok : Xchg After finished 
  Hit Return to continue ..

Again this will now go directly into the Looking at Output file menu.

Note that 'form' allows you to save the setting you have chosen in a Settings file, so that the next time you go into form it should display the settings from the last time.

Sending a Live Test :

This will copy the input test file and send it to the Destination (DU) called the 'Live DU' above.

It then tails the Item Log of that system under the assumption that 'ipformat' will give a message when the file is through. Use Ctrl-C to stop and return to the main Form Test prompt.

Prerequisites for sending a Live test (if that is possible) :

In form, you need to specify :

  • - an input file
  • - a working queue
  • - a destination (DU) which MUST be in the USERS file.
  • - Also remember to check your SC/DCs for your xchg's.
  • - Also remember to add the selection lines in tables/form/PROCESS.


USING THE WEB BASED VERSION OF 'FORM' TO TEST AND TUNE

The web interface also allows you to run offline tests of your dataformat. It is similar to form, but gives you slightly nicer views of the file (for instance hidden characters, high characters and line endings are all highlighted in red).

i) You will need to have the form option enabled in your W4 logon. This is normally done by adding the line

;;;;; Data Formatting
options:Data Formatting:/fip-pages/form/dftest.html:_blank

to either your W4 logon, or, more usually, to a template called by your W4 logon.

ii) This should be a fairly intuitive interface (feedback welcomed though)

You first select or create a test ..... a test is a definition of how you are planning to test a file, so it will define the input file, any xchgs you plan to run against it, and the format you plan to run.

 

The top section of the left hand pane will describe the parameters you choose, and the links in the bottom selection can be used to select a sequence of xchgs, sorts and formats.

 

Having selected the parameter files to run


COPY FLOW

How do you route copy into, and out of, the data formatting module ?

- Using the normal FIP routings !

A simple example is the Horse Racing Cards which arrive via dialup modem from the course administrators :

Stage              Program    Input Queue   Parameter tables and Remarks

Input              VWIRE      -             wire/RM

Routing            IPROUTE    2brouted      route/RM
    Actual routing line is :
    1		 z="*HORSE RACING*"		+horses
    ie make a 2nd copy of the file and send to destination 'horses'

Dist'n             IPWHEEL    2go           sys/USERS
    horses= DP:localhost DQ:form SC:NO DC:NO
    ie process on whichever system it arrived on, send the file directly to spool/form with NO chr translation (ipxchg).

Format Selection   IPFORMD    form          form/PROCESS
    Selection of actual Format parameter file
    ; Selection criteria for Horses
    DU=horses		 >atexhorses
    ie if the DU field is exactly equal to 'horses' do the 'atexhorses' job.

Actual Processing  IPFORMAT   -             form/text/ATEXHORSES
    The original file is deleted at the end while the new, formatted file is sent back to ipwheel with a new destination (DU) of 'atexhorses'.

Dist'n of output   IPWHEEL    2go           sys/USERS
    atexhorses= DP:atex1 DQ:junk-wir SC:HORSES DC:ATEX DF:DATAFORM
    ie send file to 2atex queue via 'xchg'.

Chr Xchg           IPXCHG     xchg          xchg/HORSES2ATEX
    Clean up the data

Send to Atex       IPGTWY     2atex         gateway/DATAFORM
    Send it to junk-wir

EXAMPLE 2 : STOCKS : More complicated examples are for the various Stocks.

The large Hong Kong tables follow the same path as 'horses' except that the names of the parameter files are obviously different.


Selection in PROCESS is on Filename only
Format text file   : form/text/HKSTOCKS
Output xchg        : xchg/HKSTOCKS

For the Regional Stocks, there are two different input formats (plus Manilla which is different but simpler) which need to be used to create almost the same output.

The Input variations are :


First type :
RIC       DISPLAY NAME     LAST    TODAY'S HIGH   TODAY'S LOW   HISTORIC CLOSE
AAH.AS    ABN-AMRO HLDGS   59.9    60             59.3          59.5
ACHN.AS   ACF HOLDING      35.7    35.7           35.5          35.5
AEGN.AS   AEGON NV         112.4   113            12.2          112.8

Second Type :
*FASTCLOSERIC9503011900KLS
SECURITY               DATE     HIGH     LOW      LAST TRADE   PREVIOUS CLOSE
AYER HITAM TIN STK     950301   4.100    3.960    4.100        3.960
AYER HITAM PLANT STK   950301   14.300   14.300   14.300       13.500
ACIDCHEM STK           950301   6.350    6.100    6.350        6.000

Manilla Stocks :
NAME          CLOSE   HIGH   LOW    PRECLOSE
A Soriano     3.2     3.2    3.15   3.2
A Soriano B   3.2     3.2    3.2    3.15

There are two outputs which are identical EXCEPT London and New York prices are quoted in fractions NOT decimals and so they require a different Character Xchg.

The processing is more complicated than for 'atexhorses' as some of the names of the Stocks are changed AND the output is sorted alphabetically. For example 'INTL BUS MACHINE' needs to be 'IBM', but as 'IBM' it will start the 'I's and NOT appear after 'Inland Steel' as in the input feed.

So we use the 'job:' keyword in the PROCESS file to get IPFORMD to run a series of jobs rather than just start IPFORMAT as in the example above.

Look at the processing for London and New York : While the selection remains similar to 'atexhorses' :

SN=LON.TXT	>fracstocks	; London
we select on the filename this time.

But at the top of the PROCESS file there are a series of parameters for the job 'fracstocks' :

;
; Job sequence for London and New York - Fractions
job:fracstocks  /bin/rm -f formsave/FRACSTOCKS*
job:fracstocks  /fip/bin/ipformat -p fracstocks -i $i -D -S FRACSTOCKS
job:fracstocks  /fip/bin/ipxchg -1 formsave/FRACSTOCKS -D fracstocks -F -o formsave
job:fracstocks  /bin/sort +0 -3 -o formsave/FRACSTOCKS.s formsave/FRACSTOCKS
job:fracstocks  /bin/mv formsave/FRACSTOCKS.s 2go/#SN:\SN#DU:atexstocks

So the copy flow for London Stocks is that the FORMAT stage is replaced by ALL these job lines in sequential order :

  • - Remove any files starting FRACSTOCKS in formsave
  • - Run IPFORMAT with the input file and parameter file FRACSTOCKS leaving the output in formsave/FRACSTOCKS
  • - Run IPXCHG once on formsave/FRACSTOCKS overwriting the input with the xchged file. This is to get the correct Stock names, eg:
    x/Hong Kong Telecm/HK Telecom
  • - Sort formsave/FRACSTOCKS on the first three words, creating an output file called formsave/FRACSTOCKS.s
  • - Move formsave/FRACSTOCKS.s to 2go with destination (DU) 'atexstocks' and preserving the original filename (SN)

Please refer to the documentation on IPFORMD in the programs section for more information on jobs.
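
The effect of the sort step in the job sequence can be sketched in Python. This is only an illustration of ordering lines on their first three words, not what IPFORMD runs: the function name and the sample prices are invented for the example, and the real /bin/sort uses the system's collation rules, which may differ from Python's plain character ordering.

```python
def sort_on_first_three_words(lines):
    """Order lines by their first three whitespace-separated words."""
    return sorted(lines, key=lambda line: line.split()[:3])


# Hypothetical records after the xchg has renamed 'INTL BUS MACHINE' to 'IBM'.
renamed = [
    "Inland Steel 31.5",
    "IBM 74.25",
    "ICI 12.0",
]

for line in sort_on_first_three_words(renamed):
    print(line)
```

With the renamed stock in place, 'IBM' now sorts at the head of the 'I's rather than after 'Inland Steel'.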

One further note on the Stocks is that all the output files pass through the STOCKS2ATEX xchg.

This is used to add column headers before certain Stock names. eg :

x/Northern Elec/\n{M1Stock\rClose\rHigh\rLow\rPrev\r\n{M0Northern Elec

4. PARAMETER FILE REFERENCE GUIDE

This is the reference and hints section describing the main Parameter file used for processing.

Overview

Part 1. Define what the input file looks like plus a general section covering fixed information.

filtyp: recsep: reckey: recsiz: fldsep: fldkey: fldsiz:
stripeol: number: startkey: keycasesens: wild: wchr: set:
include: calc: fraction: base: date: style: partial:
match: hdr: nohdr: name: chrset: before: after:

Part 2. Output Section

Record Processing Lines

This is flagged as beginning with the 'output:' keyword. It describes what processing should be done for each input record.

	output:
	 r=X	lines to describe the output.

Record lines can have system variables, input fields, tests, builtin formatting etc.

Each one of the keywords is described below.

Each line is a self-contained item ending with a NewLine (Unix) or CR LF (PC) or CR (Mac). The text parameter file can be edited by any word-processor on any (normal) platform AS LONG AS the end result is a pure, raw ascii file with no Presentation or fancy graphics embedded.

Comments are the usual semi-colon in front.

	; comment

Reserved Names

The list of keywords is the list provided above plus a series of tests and builtins described below.

Note that it is possible - but not advised - to override some of these keywords. So these names should be considered reserved. In addition a few other names have been reserved for future use :

  • blksep:
  • blkkey:
  • blksiz:

The Structure of the Parameter File

The text file is split into 2 main parts as described above. The OUTPUT section must be the second and is marked by the keyword 'output:' on a line on its own.

By common consent, the first part starts with the definitions of the file, records and fields but this is NOT strictly necessary. The advice is - do whatever is easiest for you.

Comments and the binary version of the Parameter file

Comments are lines STARTING with a semi-colon. You can have millions of comment lines and, except for the first run, they will have no effect on run-time speeds.

This is because 'ipformat' uses a compact, binary version of the Text Parameter file which is built automatically by 'ipformbl' every time you modify the Text version.

The only time you need touch the binary versions (in tables/form/bin) is that they should be deleted during every software upgrade of the DF module.

Comments are encouraged !

The Processing Loop is...

The actual processing cycle is :

For each input file hitting queue spool/form :

  • 'ipformd' will select the correct Parameter file using form/PROCESS
  • Normally this will mean starting 'ipformat' with the chosen file

Once started by 'ipformd', 'ipformat' will go through the following steps :

  • - Preprocessing
    • - Create a Fip style header unless not required with 'nohdr'. This can be the standard one or can have extra fields added using 'hdr'
    • - Add Data at the beginning of the output file if the 'before' keyword has been specified.

  • - Processing input file
    • - Split the input file into records and for each record :
      • - Start at the 'output' section of the Parameter file
      • - Check each record specification line. If it is for that record type, process it
      • - Loop around for the next record

  • - Postprocessing
    • - Add Data at the end of the output file after the processed data if the 'after' keyword has been specified.
    • - Create a Fip filename. This can be the standard one or can be replaced if the 'name' keyword is specified.

  • - Send the file to spool/2go for 'ipwheel' to distribute, usually via 'ipxchg'.

That's it !


Syntax of Each Keyword

    Syntax for 'filtyp'

    Syntax:

    	filtyp: (type).
    Where type is
    text- Ordinary text file with each record having a defined separator.
    fixed- fixed record sizes
    variable- variable record sizes

    If the filtyp is fixed or variable, you will need to specify the size (or, for variable, the maximum size) of each record.

    For most applications, filtyp:text means you also have to define the record separator, 'recsep'.

    Syntax for 'recsep' and 'fldsep'

    Syntax:

    	recsep: (FipSeq string)
    	fldsep: (FipSeq string)
    eg:		recsep:\036

    Normally the separator will be \n or \r\n for NewLine or Carriage Return NewLine.

    Note that if you just put '\n', 'ipformat' will automatically take any combination of CR and NL.

    Syntax for 'reckey' and 'fldkey' - Define the record or field key

    This defines the type and size of either the record or the field key.

    Normally keys are positioned at the beginning of the record/field but optionally these can be at the end or at an offset from the beginning or end.

    Syntax:

    	reckey: (length) : (type) : (posn) : (EndChr) : (delY/N)
    	fldkey: (length) : (type) : (posn) : (EndChr) : (delY/N)

    where

    • length or size of key - This can be 0 for any length
    • type (optional) can be
      • a - alphabetic
      • u - uppercase
      • l - lowercase
      • n - number
      • p - printable
      • s - space (or tab, CR, NL or FF)
      • b - binary (ie anything)
      • x - alphanumeric
      • c - control (ie < 040 or >= 0177)
      • z - anpa hdr field (ie alnum plus non-quad/format punct.)
      • t - punctuation
    • posn (optional) is the offset of the key from the start of the zone. If negative, count from the end of the zone
    • endchr (optional) is a single chr which terminates the key

    The separator can be any punctuation chr as long as the same chr is used for each field. eg the following two are equal :

    	fldkey  3,n,,|
    	fldkey  3|n||\174
    Ie the end Chr is a pipe but as that is used as a separator, use the octal value

    What is a key ? The key is used really as a TYPE of PROCESSING flag for the output section.

    It can be a unique record key - such as a stock code - but if you have several thousand, it is going to be unwieldy specifying all of them.

    So generally we are trying to classify records into general types. For example, take a text file containing school results like :

    	School	Pinky High
    	Head	James Pinky
    	Pupil	Ramsay		Macdonald	31.6
    	Pupil	Thatcher	Margret		77.3
    	Pupil	U-Dones		Helen		99.3
    We can use the first field, which is alpha and variable length followed by a space.
    	reckey:0:a
    This can be signalled in the output section as :
    	r="school"	(do the school bit)
    	r="head"	(do the head bit)
    	r="pupil"	(do the pupil bit)
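
The way a variable-length alphabetic key classifies records can be sketched in Python. This is a rough illustration under assumptions, not ipformat's code: `record_key` is a made-up helper that takes the leading run of letters as the key and, by default, folds it to lower case to mimic case-insensitive matching.

```python
import re


def record_key(record, case_sensitive=False):
    """Return the leading run of alphabetic characters, as 'reckey:0:a' might."""
    key = re.match(r"[A-Za-z]*", record).group(0)
    return key if case_sensitive else key.lower()


records = [
    "School\tPinky High",
    "Head\tJames Pinky",
    "Pupil\tRamsay\t\tMacdonald\t31.6",
]

for rec in records:
    print(record_key(rec), "->", rec)
```

Each record then falls into one of a small number of types ("school", "head", "pupil") that the output section can test for.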

    Syntax for 'recsiz', 'fldsiz' - Define the length of fixed size records & fields

    Syntax :

    	recsiz: (length)
    	fldsiz: (length)

    where length is the size of the record or field

    Syntax for 'keycasesens' - Field and Record keys can be upper AND lower

    Normally all keys - record or field - are considered to be case insensitive. So a key r = 'aaa' will pick up both AAA and aaa.

    Use this command to make keys case-sensitive.

    Syntax :

    	keycasesens:yes

    Syntax for 'number' - Change the number system for specifying non-printable chrs

    When a non-printable chr is specified in the form '\012' the 'number' system can be changed to decimal or hex.

    The default number system is octal.

    Syntax:

    		number:dec
    	or	number:hex
    	or	number:oct
    The change takes effect for all lines in the parameter file lower down until changed again by another 'number' keyword (Why you would specify different number systems for different parts, I have no idea).

    So a New Line chr will be

    		\012		octal
    		\010		decimal
    		\0a		hex
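
As a quick check of the three notations, the following Python sketch converts the digits after the backslash in each number system. The helper name `escape_to_chr` is invented for this illustration; it only demonstrates the arithmetic and says nothing about how ipformat parses escapes internally.

```python
# Map the 'number' keyword values onto numeric bases.
BASES = {"oct": 8, "dec": 10, "hex": 16}


def escape_to_chr(digits, number="oct"):
    """Turn the digits of an escape such as '012' into the character they name."""
    return chr(int(digits, BASES[number]))


for digits, system in (("012", "oct"), ("010", "dec"), ("0a", "hex")):
    print(system, repr(escape_to_chr(digits, system)))
```

All three forms name the same New Line character (value 10).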

    Syntax for 'wild' - Allow wild card strings using a particular chr

    Syntax :

    	wild: (Chr to use to signify a wild string)
    	wild:*
    This allows a wild string to be used when specifying record keys.

    Note there is NO automatic wild string chr - you always have to specify it.

    For example:

    	wild:$
    Allows us to specify in the output section :
    	r=$	"This is done for all records" 

    Syntax for 'wchr' - Allow a wild card character using a particular chr

    Syntax :

    	wchr: (Chr to use to signify a wild character)
    	wchr:?
    This allows a wild character to be used when specifying record keys.

    Note there is NO automatic wild chr - you always have to specify it.

    For example :

    	wchr:?
    Allows us to specify in the output section :
    	r="abc?e"	"This is done for all records abc(something)e" 
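
The difference between 'wild' (a whole string) and 'wchr' (a single character) can be sketched with Python regular expressions. This is an approximation under assumptions: `key_matches` is a hypothetical helper, a wild string is assumed to match zero or more characters, and matching is case-insensitive as for ordinary keys.

```python
import re


def key_matches(spec, key, wild=None, wchr=None):
    """Test a record key against a spec using optional wild-string/wild-chr symbols."""
    parts = []
    for ch in spec:
        if wild is not None and ch == wild:
            parts.append(".*")           # wild string: any run of characters
        elif wchr is not None and ch == wchr:
            parts.append(".")            # wild chr: exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    pattern = re.compile("".join(parts), re.IGNORECASE | re.DOTALL)
    return pattern.fullmatch(key) is not None


print(key_matches("abc?e", "abcde", wchr="?"))   # one character in the gap
print(key_matches("abc?e", "abcdde", wchr="?"))  # two characters in the gap
print(key_matches("$", "anything", wild="$"))    # wild string matches all keys
```

Because neither symbol is set by default, the sketch treats '?' and '$' as ordinary characters unless they are passed in explicitly.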

    Syntax for 'stripeol' - Do NOT strip multiple blank records

    Syntax :

    	stripeol:no
    Where the 'recsep' is some combination of CR and NL, normally blank lines or multiple occurrences of CR and NL are stripped.

    This command is used to turn that option OFF and to treat all lines as valid records, even ones with no data.
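
The two behaviours can be compared with a small Python sketch. It assumes records are separated by any mix of CR and NL; `split_records` and its `stripeol` flag are names invented here to mirror the keyword, not part of ipformat.

```python
import re


def split_records(data, stripeol=True):
    """Split text into records on CR/NL, optionally dropping blank records."""
    if stripeol:
        # Runs of CR/NL collapse into one separator, so blank lines vanish.
        return [part for part in re.split(r"[\r\n]+", data) if part]
    # Each CR LF, CR or NL ends a record, so blank lines survive as records.
    parts = re.split(r"\r\n|\r|\n", data)
    if parts and parts[-1] == "":
        parts.pop()  # no phantom record after the final separator
    return parts


sample = "a\r\n\r\nb\nc\r"
print(split_records(sample))                  # default behaviour
print(split_records(sample, stripeol=False))  # as with stripeol:no
```

With the default, the blank line between 'a' and 'b' disappears; with stripping turned off it is kept as an empty record.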

    Syntax for 'startkey' - Force the record Key or type of the BEFORE-FIRST record

    Syntax :

    	startkey:yes
    This forces the key BEFORE the first record to be 'x1594' which can be used in the 'ifprv' test - if previous key.

    Syntax for 'set' - Short forms or format names

    Syntax :

    	set	(name)	(any fixed text) 
    	set	pagehdr	Stock<t>Close<tr>High<tr>Low<tr>Prev<qr>#\n
    Set lets us specify easy-to-remember names to reference strings

    Sets can NOT be split over several lines.

    All leading and trailing spaces are stripped. So use either double quotes to embed or the '\s' escape string. To specify a double quote, use the octal string.

    FipSeq strings are usable but note that 'set' lines are parsed when the parameter file is changed/created so that any Variable data will be of that time and any FIP hdr data meaningless.

    	eg	set	timedate	\$d-\$m-\$y \$h:\$n
    will produce the date and time of when the parameter file was last changed.

    To get run time date/time, specify the same string in the output section record line NOT the 'set'

    The name of the set - timedate in our example - is case-INsensitive. So it may be called as TIMEDATE or TimeDate or any other variation known to man.

    Syntax for 'include' - Include another file of instructions

    Syntax :

    	include	(filename)
    	include	quark.tags
    This will include another text file in tables/form/text with more Data Formatting commands.

    The filename is forced UPPERCASE as normal so in the above case the file will be :

    	/fip/tables/form/text/QUARK.TAGS
    Generally include files will have commands such as 'set', 'calc' which are common to a series of Data Formats - like a quark styles for example.

    Note that if you update the include file, ipformat will NOT rebuild its binary (normally, of course, ipformat realises a change has been made to the main text file and rebuilds its binary automatically). So you need to 'touch' all the other text files in form/text to get the new version :

    	ipt form/text
    	touch *

    Syntax for 'calc' - Define a Calculation or Choose output style for a number

    Syntax :

    	calc:(name)	(The calculation) 
    	calc:percent	100*(c1/c2)
    Define a calculation where cX are used as variables. The 'name' is that used in the 'output' section.

    Calculations can NOT be split over several lines.

    Variables are loaded at run-time using savnum, savbyte, savint, savswint, savlong or savswlong.

    The default precision for a calculation is 2 decimal places. This can be overridden using the syntax :

    	calc:percent:0	100*(c1/c2)
    where the '0' after the name is the number of decimal places in the range 0-6.

    Calc can also be used to change the output format of a number by specifying the precision. If the raw data is in 4 dec places and you only want 2:

    	calc:dec2:2	c22
    and in the output section, load c22 from data - in this case field 3 :
    	savnum=22 f3
    Operators can be :
    • + plus
    • - minus
    • * multiply
    • / divide

    Take care when dividing by zero !

    Use round brackets to denote how the calculation should be worked out. Ie the deepest level is calculated first, and then the calculation is worked from left to right.

    For example :

    	(c1/c2)*(c3/(c5+c4))/100
    will run through the following order :
    	Step 1	(c5+c4)
    	Step 2	(c1/c2)
    	Step 3	(c3/result of Step 1)
    	Step 4	(result of Step 2 * result of Step 3)
    	Step 5	(result of Step 4 / 100)
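
The evaluation order and the precision setting can be checked with a short Python sketch. The function names are invented for illustration, returning None for a zero divisor is an assumption (the guide only warns you to take care), and whether ipformat rounds or truncates to the requested precision is not stated here, so the rounding below is also an assumption.

```python
def worked_example(c1, c2, c3, c4, c5):
    """Evaluate (c1/c2)*(c3/(c5+c4))/100 one step at a time."""
    step1 = c5 + c4        # deepest brackets first
    step2 = c1 / c2        # then left to right
    step3 = c3 / step1
    step4 = step2 * step3
    step5 = step4 / 100
    return step5


def percent(c1, c2, precision=2):
    """Mimic 'calc:percent:N 100*(c1/c2)' with N decimal places."""
    if c2 == 0:
        return None  # assumption: the caller decides what a zero divisor means
    return f"{100 * (c1 / c2):.{precision}f}"


print(worked_example(1, 2, 3, 4, 5))
print(percent(1, 3))      # default precision of 2 decimal places
print(percent(1, 3, 0))   # precision overridden to 0
```

The stepwise version gives the same value as writing the whole expression in one go, which is a handy way to confirm the brackets say what you meant.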

    Syntax for 'fraction' - Define a Fraction

    Syntax :

    	fraction:(name):(precision) (Output style WITH & WITHOUT fraction)
    where
    • name is the function to be used in the output
    • precision is smallest denominator - 2, 4, 8, 16, 32, 64, 128 or 256 - default is eighths
    • + is used as separator - which can be any punctuation
    • first style is parse string used if THERE IS A FRACTION and integer
    • + is used as separator
    • second style is parse string used if THERE IS ONLY AN INTEGER and NO FRACTION
    • + is used as separator
      		fraction:star:16	+\ZI \ZD/\ZN+\ZI.0+
      		fraction:stox:32	|\ZI (\ZD-\ZN)|\ZI|
      Fraction takes a field, partial field, saved field or calculation and splits it into portions which can then be used as normal FipSeq.
    • ZS is the sign + or -
    • ZI is the integer
    • ZD is the fraction amount
    • ZN is the fraction denominator

    Specify two styles - the first for if there is a non-zero fraction amount and the second if there is none.

    Syntax for 'base' - Define a Base number

    Syntax :

    	base:(name):(precision) (Output style WITH & WITHOUT fraction) 
    Base is exactly the same as fraction except base will NOT attempt to 'reduce' the fraction.
    • ie 2/4 is left as ZD=2, ZN=4 while fraction will force ZD=1 ZN=2
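
The difference between 'fraction' and 'base' can be sketched in Python. This is an illustration under assumptions, not the module's algorithm: the helper names are invented, the value is assumed to be rounded to the nearest multiple of 1/precision, and the \ZS, \ZI, \ZD and \ZN substitutions are done by simple string replacement.

```python
from math import gcd


def split_fraction(value, precision=8, reduce=True):
    """Split a decimal into sign, integer, numerator and denominator."""
    sign = "-" if value < 0 else "+"
    magnitude = abs(value)
    integer = int(magnitude)
    numerator = round((magnitude - integer) * precision)
    denominator = precision
    if numerator == denominator:  # rounding carried into the next integer
        integer += 1
        numerator = 0
    if reduce and numerator:      # 'fraction' reduces, 'base' does not
        common = gcd(numerator, denominator)
        numerator //= common
        denominator //= common
    return sign, integer, numerator, denominator


def render(value, precision, with_fraction, integer_only, reduce=True):
    """Fill the first style if there is a fraction, else the second."""
    zs, zi, zd, zn = split_fraction(value, precision, reduce)
    style = with_fraction if zd else integer_only
    for name, field in ((r"\ZS", zs), (r"\ZI", zi), (r"\ZD", zd), (r"\ZN", zn)):
        style = style.replace(name, str(field))
    return style


print(render(12.375, 16, r"\ZI \ZD/\ZN", r"\ZI.0"))                 # like 'fraction'
print(render(12.375, 16, r"\ZI \ZD/\ZN", r"\ZI.0", reduce=False))   # like 'base'
print(render(12.0, 16, r"\ZI \ZD/\ZN", r"\ZI.0"))                   # integer only
```

The two style strings correspond to the two parse strings between the separators in the 'fraction' line.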

    Syntax for 'date' - Generate a date and/or time from a number

    Syntax :

    	date:(name):(data order)	(Output style)
    	date:mono:dm	 +Last \ZW was \ZD-\ZN-\ZZ+
    where mono is the name which is used in the output section

    The Order of the raw data is important. So we divide it up into a series of 0 or more 2-digit numbers (or space digit) whose order is specified using :

    • d - day -- 1-31
    • m - month -- 1-12
    • y - year -- (last 2 digits only, must be after 1970)
    • c - century -- 19 or 20 only
    • h - hour -- 0-23
    • t - minute -- 0-59
    • x - padding -- ignore 1 or 2 numbers at this position. Only 1 padding is allowed.

    So for the 21st March 2010, the incoming data must be :

    if it is : 21032010    data order is : dmcy
    if it is : 210310      data order is : dmy
    if it is : 032110      data order is : mdy

    Obviously only one format of data can be handled by a single 'date' but you can have as many 'date's as needed.

    Note that spaces, letters and punctuation are stripped and/or used as field delimiters. So the following are all equivalent :

    • 200896
    • 20th 8-August, 96
    • This day 20th August (the 8th month) in the Year 96

    The Output format is defined in normal FipSeq between two delimiters ('+' in our example above, but can be any punctuation except semicolon ';').

    The new data is added to a series of extra FipHdr fields :

    • ZD - 2 digit day of month
    • ZM - 2 digit month
    • ZY - 2 digit year e.g. 92
    • ZZ - 4 digit year e.g. 1992
    • ZW - Day of week e.g. Monday, Tuesday etc
    • ZS - 3 character day of week e.g. Mon, Tue etc
    • ZN - Full name of month e.g. January, February
    • ZL - 3 character month e.g. Jan, Feb, Mar etc
    • ZJ - Julian day of year
    • ZH - Hour 00-23
    • ZI - Hour 00-12
    • ZT - Minute 00-59
    • ZP - am/pm
    • ZA - Week Number 00-53
    • ZB - Week Number 01-53
    • ZC - Day of the week 0-6

    (see also manual page for 'strftime' for slightly more information)

    Note that actual Day and Month names depend on the LOCALE of your shell/computer.

    The Default output format, if none is specified, is :

    	+\ZW, \ZD \ZN \ZZ+
    Which for (order dmcy), (data) 16111997 English LOCALE gives
    	Sunday, 16 November 1997
    Note that if any information is NOT supplied, the run time date/time is used.
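
The data-order idea can be sketched in Python. This is a simplified model under stated assumptions, not the module's parser: `parse_date` is an invented name, runs of digits longer than two are cut into 2-digit pieces, a 2-digit year of 70 or more is assumed to mean 19xx and anything lower 20xx when no century is supplied, and missing items fall back to the current date and time.

```python
import re
from datetime import datetime


def parse_date(raw, order, now=None):
    """Read 2-digit numbers from raw in the given order (d, m, y, c, h, t, x)."""
    now = now or datetime.now()
    chunks = []
    for run in re.findall(r"\d+", raw):  # letters, spaces, punctuation delimit
        if len(run) <= 2:
            chunks.append(run)
        else:
            chunks.extend(run[i:i + 2] for i in range(0, len(run), 2))
    got = dict(zip(order, map(int, chunks)))  # 'x' padding is simply ignored
    if "y" not in got:
        year = now.year
    elif "c" in got:
        year = got["c"] * 100 + got["y"]
    else:
        year = (1900 if got["y"] >= 70 else 2000) + got["y"]  # assumed pivot
    return datetime(year, got.get("m", now.month), got.get("d", now.day),
                    got.get("h", now.hour), got.get("t", now.minute))


# Default style, roughly '\ZW, \ZD \ZN \ZZ', in the process's locale.
print(parse_date("16111997", "dmcy").strftime("%A, %d %B %Y"))
print(parse_date("20th 8-August, 96", "dmy").date())
```

The strftime codes used here (%A, %d, %B, %Y) line up with the ZW, ZD, ZN and ZZ fields listed above.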

    Syntax for 'partial' - Further subdivide a record or field

    Syntax :

    	partial:(name)	(type):(length):(startchr):(endchr)
    where
    • type can be
    • a : alphabetic
    • u : upper case alpha
    • l : lower case alpha
    • n : numeric
    • t : punctuation
    • p : printable (generally Spc to ~)
    • s : space ( tab, space, ff, cr, nl, lf)
    • z : anpa type string (alphanumeric plus hyphen)
    • b : binary
    • length can be zero for unlimited.
    • startchr and endchr are optional.

    There can be up to 100 partial fields as required.

    The contents of a Partial field are accessed by specifying pX where X is the sequential no from the start of the partial - ie first is p1, then p2 etc

    Syntax to partial a field in the output section record line is

    	(partial name) (field name)
    and then you can use any partial fields. For example:
    • To pull apart a date in the form : 26th March 1997
      		; comment		26  th		March	1997 
      		partial:pdate	s:0 n:2 a:0 s:0 a:0 s:0 n:4 
      In the output section, assuming field 3 of record type 99 contains the date, we can output just the year by :
      		r=99	pdate f3	p7
      will produce "1997"
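To make the numbering of partial fields concrete, the following Python sketch walks a 'partial' specification across a string and returns the pieces p1, p2 and so on. It is an illustrative approximation only: the name `split_partial` is hypothetical, only a subset of the types is covered, and the optional startchr and endchr are not handled.

```python
import string

# Character classes for a few of the partial types (illustrative subset).
CLASSES = {
    "a": lambda ch: ch.isalpha(),
    "u": lambda ch: ch.isupper(),
    "l": lambda ch: ch.islower(),
    "n": lambda ch: ch.isdigit(),
    "t": lambda ch: ch in string.punctuation,
    "s": lambda ch: ch.isspace(),
}

def split_partial(spec, data):
    """Return the list [p1, p2, ...] for a spec such as 's:0 n:2 a:0'."""
    fields = []
    pos = 0
    for item in spec.split():
        kind, length = item.split(":")[:2]
        limit = int(length)
        start = pos
        # Consume matching characters; a length of 0 means unlimited.
        while pos < len(data) and CLASSES[kind](data[pos]):
            if limit and pos - start >= limit:
                break
            pos += 1
        fields.append(data[start:pos])
    return fields

p = split_partial("s:0 n:2 a:0 s:0 a:0 s:0 n:4", "26th March 1997")
print(p[6])  # list index 6 corresponds to p7
```

Note that p1 is empty here because the sample date has no leading spaces, which is why the year ends up in p7 rather than p6.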

    Syntax for 'match' - search and replace on a single field

    Syntax :

    	match:(matchname)	/(search string)/(replacement)
    	match:(matchname):c	/(search string)/(replacement)
    where
    • matchname is a unique alphabetic name
    • c (optional) is for case-SENSITIVE searches
    • the delimiters - '/' in the above example - can be any punctuation as long as they are the same for that match line

    This is a localized search and replace or search and zap function which is applied ONLY to a record, field, partial field or save area.

    Normally the search is case-INSENSITIVE but it can be forced to be case-SENSITIVE with the ':c' after the matchname.

    No wild chr or wild strings are permissible at present.

    To specify in the output section :

    		(matchname) (zone)
    where
    • matchname is the SAME as specified in the 'match'
    • zone is the record, field, partial or save area.

    Up to thirty matches can be specified for a single zone.

    For example :

    	match:jan	/1/January/ 
    	match:feb	/2/February/ 
    	match:mar	 /3/March/ 
    	etc .. etc 
    	match:dec	/12/December/
    In the output section - let's pretend field 4 of record type 23 contains a number we want to translate to a month :
    	r=23	dec nov oct sep aug jul jun may apr mar feb jan f4 
    Note that dec, nov and oct are done first; otherwise, if 'f4' was "11", the jan match would replace "11" with "JanuaryJanuary" !

    'match' complements the character xchg in IPXCHG. However they are slightly different in that 'match' is localised in that you apply it ONLY to a single field whereas IPXCHG works on a whole file (or using flags, to a selected paragraph, line or section of the file).
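The ordering problem with the month matches can be reproduced with a few lines of Python. This is only a sketch of a sequential, case-insensitive search and replace; the function name `apply_matches` is hypothetical and nothing here is taken from the ipformat source.

```python
import re

def apply_matches(value, matches):
    """Apply (search, replacement) pairs in order, case-insensitively."""
    for search, replacement in matches:
        # re.escape: the search string is literal text, no wild cards.
        value = re.sub(re.escape(search), lambda m, r=replacement: r,
                       value, flags=re.IGNORECASE)
    return value

jan = ("1", "January")
nov = ("11", "November")

print(apply_matches("11", [nov, jan]))  # November
print(apply_matches("11", [jan, nov]))  # JanuaryJanuary
```

Running the longer search strings first means the shorter ones no longer find anything to replace.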

    Syntax for 'style' - Reformat a data zone

    Syntax :

    	 style: (name)	(a single printf conversion syntax)
    		  style:twonum	 %.02d

    • - Uses printf which is nasty but a standard (of sorts). Do a man printf for fuller information if you need.
    • - Always starts '%'
    • - If the expression does not end with an 's' ('d' for integer for example), then the string in the header field is first converted to that type.
    • - Specify One and ONLY one expression (can not have %s%d%f) - as it takes the first only
    • - do NOT use for fixed data, ONLY the conversion string.
    • - Types are :
    • string : s
    • char : c
    • long : d,i,o,p,u,x,X
    • float : f,e,E,g,G
    • % : print a % !
    • type n is ignored ??

    Examples

    - to trim a string, use a dot: %.5s
    - To pad a string with spaces: %5s
    - To pad a string with spaces (left justified): %-5s
    - To pad a number with leading zeros: %.06d
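Python's '%' operator accepts the same printf conversions for the common cases, so the examples above can be tried out quickly. The sketch below is an assumption about how a single conversion might be applied to a string zone; `apply_style` is a hypothetical name and the 'p' conversion is not supported by Python.

```python
def apply_style(fmt, text):
    """Apply a single printf-style conversion to a string zone."""
    kind = fmt[-1]
    if kind in "diouxX":
        # Integer conversions: convert the string to a number first.
        return fmt % int(text)
    if kind in "feEgG":
        return fmt % float(text)
    if kind == "c":
        return fmt % text[:1]
    return fmt % text

print(apply_style("%.02d", "7"))
print(apply_style("%.5s", "abcdefgh"))
```

The conversion of the text to a number before formatting mirrors the rule that a non-'s' expression first converts the header string to that type.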

    Syntax for 'name' - Overwrite the default name of the output file

    Syntax :

    	 name: (Fip Hdr strings)
    Remember that the 'name' is a list of Fip Hdr fields which will probably include the SN field which is the original name.

    The default name on output is :

    	#SN:(original filename)#DU:(name of the paramfile)s#SC:FORM#DF:FORM
    Where DU is forced lowercase.

    The 'hdr' and 'name' keywords have the same syntax and roughly the same use so further information is under 'hdr'.

    Remember 'name' is done AFTER all processing of the data - it is the last thing done before the file is sent on for further distribution. So information gleaned from the input, perhaps left in Save Areas, can be used in the name. 'hdr' however is done FIRST, BEFORE any data is touched.

    Syntax for 'hdr' - Add extra fields to the FIP header of the output file

    Syntax :

    	 hdr: (Fip Hdr strings)
    Remember that although the specification MUST be kept on a single line, HASH may be used as a delimiter for Header fields.

    Note also that as the output file is a completely NEW file and has no physical connection to the input file, NO Fip Hdr fields are transferred from input to output UNLESS specified in the 'hdr' or 'name'.

    Generally use 'hdr' to preserve fields you need, such as the Source Header, because for 'name' there is a limit to the number of characters and to the type of characters (no meta characters, slashes '/' etc). For example :

    	hdr:#SH:\SH#SN:\SN#DF:roger
    will transfer the SH and SN fields and force DF to roger.

    As 'hdr' is processed BEFORE the data, no information generated by IPFORMAT during processing is available. However the 'name' keyword is processed AFTER so such data may be added then.


    The default Fip Hdr put on an output file has the following fields :
    SU:form - ie Source is form
    HS:form_0_95-3-9_17:41:40_4_67 - for tracking the file
    HT:794792500 - date and time !!
    'hdr' supplements but does not replace these.

    Other useful fields can be :

    • DF - output format for ipedsys, ipgtwy, ipout, ipprint etc. This must be in the 'name' parameter as by default it contains 'DF:form' eg :
      	DF:albert
      will pickup tables/print/ALBERT if ipprint is the program sending to the final destination.

    • CX - force the xchg name to be used to the contents of this field. eg
      	CX:STOCKS
      will override the SC2DC fields normally used for 'ipxchg'.

    Syntax for 'nohdr' - do NOT add a Fip header to the output file

    Syntax :

    	nohdr:
    This is used when the output file is to be used immediately by a Unix program which does not understand the Fip Hdr - sort for example.

    Syntax for 'chrset' - force the SC or Source Character set

    Syntax :

    	chrset: (name)
    This fills in the SC: Fip Hdr field. The default is FORM.

    Syntax for 'before' - Insert data BEFORE any output from the input file.

    Syntax :

    	before: (Fip Seq strings and Record processing commands)

    Syntax for 'after' - Insert data AFTER ALL output.

    Syntax :

    	after: (Fip Seq strings and Record processing commands)
    Both 'before' and 'after' can have tests, builtins, contents of save areas etc although obviously for 'before', most of these may have nothing in.

In the Output section - Record processing lines

The actual data is processed and output using the Record processing lines.

The Syntax of each line is that the first bit specifies which record type or key the rest of the line applies to :

		r=(key)		(output)
For example :
	r=abc	"Fried fish starts " s1 spc f4 " and the rest ..."
There can be multiple lines for the same record type. The following two lines will give the same result as the one above :
		r=abc	"Fried fish starts " s1 
		r=abc	spc f4 " and the rest ..."
For lines where you want to process for all EXCEPT a particular record type/key, use the syntax 'r#' :
		r#35	"Nobody wants record 35s !"


How to specify you want to use, format and/or output zones ?
records		r3	or r="abc" if not numeric
fields		f99	or f=Z if not numeric
partial fields	p22	- always numeric
save zones	s1	- always numeric
flags		x199	- always numeric
calculations	c4	- always numeric
counters	z4	- always numeric
blocks		b77	- always numeric
set name	---	specify name as in the 'set'
fixed text	---	" some fixed text "

A '*' can be used in certain cases to signify 'ALL' zones ie
clrflag=*	clear all flags.
f*		output all fields from this record.

Use double quotes for alphabetic keys and those with embedded spaces.

You should try not to use 'sets', 'partial's or 'match's with names in the form 'z999' where z is one of the single letters above and 999 is a number in the range 1-999.

Note that blocks are 'super records' but should be ignored for now.

Note that case is IGNORED in keys in the current version .

When 'ipformat' finds a name in the record processing line, it does the following sequence :

  • - check to see if it is an already specified constant 'set' name.
  • - if not, is it a zone - record, field, partial or block (eg p44, f3)
  • - if not, is it a builtin command (see below)
  • - if not, is it a save zone, flag or counter (eg s1, x33, c7)
  • - otherwise it is considered some FipSeq string and saved as such.

Spaces, End-Of-Lines and Double Quotes in the output

One common failing when putting together a new parameter file is to completely forget about spaces (or other separators) and end-of-lines (CR or NL or CR NL or whatever) in the output file.

The point is - you have to specify them as NOTHING is implicit in the output file. There is no hidden magic which suddenly realises that you want an end-of-line when you need it. You have to state where and when you want them.

Generally this will be done by either putting them as constant/'set's or specifying them in the record processing line. The following are exactly the same : Either

	set	spc	\s
	set	ql	\n
	output:
	r="BIG"		f5 spc f3 spc f99 spc f5 ql
Or
	output:
	r="BIG"		f5 \s f3 " " f99 \s f5 \n
As you can specify a space as either '\s' or in double quotes, to output a double quote character, you need to specify it as an octal number : \042.

Builtins

There are a number of builtin conversion routines for formatting zones - records, fields, save areas etc.

These are called by placing the name of the conversion BEFORE the name of the zone eg :

		zapspcextra p5
which means :
  • 'zap all leading, trailing and multiple spaces from partial field 5'
A single zone can be subject to several builtins :
zappunc zapspc caps f=Z
which means :
  • take field "Z" and zap all punctuation and zap all spaces and force uppercase before outputting.

Builtins for case conversion :
caps - force zone uppercase
lwrcase - force zone lowercase
idicase - force zone idiot upper and lowercase
upper1 - force first letter of every word uppercase
initial - only display first letter of each word followed by a full stop
Builtins for removing spaces:
zapspc - remove all spaces from zone
zapspcextra - remove all leading, trailing and multiple spaces from zone
zapspclead - remove all leading spaces from zone
zapspctrail - remove all trailing spaces from zone
Builtins for removing punctuation:
zappunc - remove all punctuation from zone
zappuncextra - remove all leading, trailing and multiple punctuation from zone
zappunclead - remove all leading punctuation from zone
zappunctrail - remove all trailing punctuation from zone
Builtins for Counters:
setctr - set a counter
incctr - add one to a counter
decctr - subtract one from a counter
clrctr - clear a counter or set it to zero
Builtins for Calculations:
savnum - save a printable number in a variable
savbyte - save a single byte in a variable
savint - save a binary integer (2 bytes) in a variable
savswint - save a binary integer (2 bytes swapped) in a variable
savlong - save a binary long (4 bytes) in a variable
savswlong - save a binary long (4 bytes swapped) in a variable
Miscellaneous:
strlen - returns the length of the string which can be output or saved or tested
zapleadzero - removes leading zeros from zone
zapctl - remove all control characters from zone
incfile - include standing file at this point
	r=99	incfile /home/standing/ s4
newfile - finish this file, send it and start another
	r=abc	newfile
If any more information is specified AFTER the 'newfile' on the record processing line, it will be added to the FIP Hdr unless 'nohdr' has been specified. eg:
	r=abd	newfile	#DF:newform#QQ:\$Z
log - log message in the Item Log
continue - ignore all other tests for this record and continue with the next data record
stop! - stop processing now. If there is an 'after' section it is done before the program finishes. (please note the exclamation mark !)
reckey - output the actual record key. This is useful where wild cards are used for all records but you still need to output what the key was.

Tests

There is a further selection of tests which can be made on zones inside the data.

These enable you to select processing even more finely depending on the actual data. If and ONLY if the test is true is the rest of the line continued with.

Syntax for Tests

	(ifxxx) (first string) (second string if required) 
where strings can be fields, partials, saves or fixed text

Actual tests can be :
ifprv/ifnprv - test previous record type/key or not
ifeq/ifne - test if 2 zones are equal or not
ifgt/iflt - test if a zone is greater than another or not
ifflag/ifnflag - test if a flag is ON or OFF
ifnul/ifnnul - test if a zone is empty or not
ifspc/ifnspc - test if a zone only contains spaces or not
ifalpha - test if a zone only contains letters a-Z or not
ifnum - test if a zone only contains number/digits 0-9 or not
ifcon/ifncon - test if a string is (not) found within another
ifpunct - test if a zone only contains punctuation or not

Note that sequence is important for comparing two fields that may be different lengths as ifeq will be true if the first field is complete ie :

		1st=AAA		2nd=AAABC	will be true 
		1st=AAABC	2nd=AAA		will be false 
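The rule in the two lines above amounts to a prefix test: the first string has to be matched completely at the start of the second. A minimal Python sketch of that reading follows; the function name is hypothetical and any case handling is left out.

```python
def ifeq(first, second):
    """True when the whole of `first` is found at the start of `second`."""
    return second.startswith(first)

print(ifeq("AAA", "AAABC"))   # True
print(ifeq("AAABC", "AAA"))   # False
```

So when the two zones may differ in length, put the shorter or fixed string first.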
Example 1 :
	r=24	ifprv r=35	"Last record was type 35 and this is 24"
Only if the previous record type was "35" will the string be output

Example 2 :

	r=24	 f3 ifnul f3 " _ " x99
For record type 24, output field 3 and if there was nothing in it, output a (spc) (dash) (spc). Flag 99 will also be set if there was nothing there.

Example 3 : When using numeric data, please ensure that all extraneous characters are stripped from the zone before the test. In particular strip commas, plus signs, currency symbols etc. For example, if field 7 has data like p9300.0007 and save field 9 has 10,000 compare the two by :

	match:mnop	/p//
	match:mnocomma	/,//
	output:
	r=99	ifgt mnop mnocomma f7 mnop mnocomma s9		"Field 7 >  Save 9"

Using Flags

Flags are a really useful means for deciding type of processing to do - or NOT to do.

Commands for setting, clearing and testing flags are
- To set a flag: x999 -- where 999 is the flag number
- To clear a flag: clrflag=999
- To clear all flags: clrflag=*
- To test a flag is ON: ifflag x3 (rest of the commands on line are done ONLY if the flag is ON)
- To test a flag is OFF: ifnflag x5 (rest of the commands on line are done ONLY if the flag is OFF)

For example, let's use flag 3 to test if record type 'abc' has Richard, Helen or George in the first field. Print out 'New name is (name) (newline)' if it does :

	r=abc	clrflag=3 
	r=abc	ifeq "Richard" f1	x3 
	r=abc	ifeq "Helen" f1		x3 
	r=abc	ifeq "George" f1	x3 
	r=abc	ifflag x3		"New name is " f1 nl 

Save areas

Save areas may be used to store strings - either in their original state or after conversion/formatting by other built-ins. The maximum save number is 299.

Commands for setting, clearing and testing save areas are :
- To output a save area: s299 -- where 299 is the save number
- To clear a save area: clrsave=299
- To clear all saves: clrsave=*
- To save data in a save area: save=299 (string)
	eg	save=1	f3 
		save=5	caps f7
save the contents of field 7 in save area 5 AFTER forcing to Uppercase
- To append data to a save area: savcat=88 (string)
	eg	save=77 "ABC"
save zone 77 holds ABC
		savcat=77 "DEF"
save zone 77 now holds ABCDEF

Save areas may be used in the normal 'if' tests, eg :

	ifeq "AAA" s1	x88 
if the contents of save area 1 start with "AAA", set flag 88 ON

Using Counters

Counters are integers (ie proper numbers with no decimals or fractions) in the range -32000 to +32000.

They are signalled by 'zX' where X is a number.

They can be used to count the number of occurrences of a record or field or even types of data and act accordingly.

All counters are set to zero when the program starts and by using the builtins :

  • incctr
  • decctr
  • setctr
  • clrctr
in the Record processing lines, you can manipulate them.

For example, to add some random markup every 10th line of a record type AB using counter 26 :

	r="AB"	incctr=26	ifeq 10 z26	clrctr=26	"[pt9][font99]"
ie : For all records type AB, add 1 to counter 26, then test if ctr 26 is equal to 10; if so reset ctr 26 back to zero and output string '[pt9][font99]'.
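The same counting logic, written out in Python purely as an illustration (the function and variable names are hypothetical):

```python
def mark_every_tenth(records, marker="[pt9][font99]"):
    """Return the markers output while processing a list of type-AB records."""
    out = []
    counter = 0            # all counters are zero when the program starts
    for _ in records:
        counter += 1       # incctr=26
        if counter == 10:  # ifeq 10 z26
            counter = 0    # clrctr=26
            out.append(marker)
    return out

print(len(mark_every_tenth(range(25))))  # 2
```

Because everything after the test is skipped when the test is false, the counter is only cleared on the tenth record.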


The syntax for 'setctr' is
setctr=99 345 - set ctr 99 to a fixed number 345
setctr=297 p3 - set ctr 297 to the contents of partial field 3.

In the second example, if the p3 is NOT a number, ctr 297 is set to zero. Also if p3 is a decimal number like '123.456', only the main number is saved.
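A small Python sketch of the conversion just described, assuming that anything which is not a plain number becomes zero and that only the part before the decimal point is kept. The name `to_counter` is hypothetical and the counter range limit is not enforced here.

```python
import re

def to_counter(text):
    """Convert zone text to a counter value; non-numbers give zero."""
    m = re.fullmatch(r"\s*([+-]?\d+)(\.\d*)?\s*", text)
    return int(m.group(1)) if m else 0

print(to_counter("345"), to_counter("123.456"), to_counter("abc"))
```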

Using Calculations

Calculations are defined in the first part of the parameter file and used in the record processing part :

For example :

	calc:mktcap	c1*c2
	output:
	r=BC	savnum=1 f5	savnum=2 f7	mktcap
In this example we define 'mktcap' to be variables 1 and 2 multiplied together. Then in the output section, for record type BC, field 5 is saved in variable 1 and field 7 in variable 2 before we do the calculation and output the result.

A quick word about BINARY numbers.

Normally fields will hold printable data - such as in the example above - and we use the builtin 'savnum' to take that number for use in the calculation(s).

However some data is already in a binary form. Use builtins 'savbyte, savint, savswint, savlong and savswlong' to load these numbers. Often these will be derived from a partial field using the 'b' for binary field type. eg:

	partial:bindata	b:2 b:4 b:2 b:4
What is a swapped integer or long ? Some computers - like the PDP-11 and most Intel 16+ bit chips - hold the data in reverse byte mode.

- So if the data has been generated on a SPARC or rs6000 or a Mac the data is 'normal' - use savint or savlong.

- While data from PDP-11s or Intel based PCs could well need to be swapped.
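The difference between 'normal' and swapped byte order can be seen with Python's standard struct module. In this sketch the function names simply reuse the builtin names; treating the values as signed, and treating 'swapped' as plain least-significant-byte-first order, are both assumptions.

```python
import struct

def savint(data):
    """2-byte integer in 'normal' order (most significant byte first)."""
    return struct.unpack(">h", data[:2])[0]

def savswint(data):
    """2-byte integer with the bytes swapped (least significant byte first)."""
    return struct.unpack("<h", data[:2])[0]

def savlong(data):
    """4-byte integer in 'normal' order."""
    return struct.unpack(">l", data[:4])[0]

def savswlong(data):
    """4-byte integer with the bytes swapped."""
    return struct.unpack("<l", data[:4])[0]

raw = bytes([0x01, 0x02])
print(savint(raw), savswint(raw))  # 258 513
```

If a number loaded from a binary field looks wildly wrong, trying the swapped variant is a quick check.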

Loading Variables :

  • Save a printable zone as a number variable - use 'savnum'
    savnum=5 p4
    - save the contents of p4 as a number. So if p4 held the string '789', c5 would be the number '789'.
  • Save a fixed number in a number variable - use 'savnum' again
    savnum=7 1234
    - loads the number '1234' into c7.
  • Save the contents of a single byte - use 'savbyte'
    savbyte=33	p7

Note that the contents of the variables, c1, c2 etc are not amended by the calculation UNLESS you specifically save it, eg :

	r=BC	savnum=1 f5	savnum=2 f7	savnum=3 mktcap
will load c3 with the result of the 'mktcap' calculation.

Examples of Builtins :
STRLEN

	; test the field 2 is greater than 44 chrs (ie 44 is less than strlen of f2)
	r=HH	 ifnnul f2		 iflt 44 strlen f2		 "Big Field 2 here over 44 chrs long" \n
	r=KK	 "Save Field for Name (s55) is " strlen s55 " chrs long"
ZAPLEADZERO
	; data - field 99 is 00000330303, field 101 is 00000000.00
	r=3	  "This outputs 330303=" zapleadzero f99 ", while this is 0.00=" zapleadzero f101
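The two results in the ZAPLEADZERO example suggest that one zero is kept in front of a decimal point. A Python sketch of that behaviour follows; keeping a single "0" for an all-zero zone is an assumption, as the text does not cover that case.

```python
def zapleadzero(zone):
    """Strip leading zeros but keep one before a decimal point."""
    stripped = zone.lstrip("0")
    if stripped == "" or stripped.startswith("."):
        stripped = "0" + stripped
    return stripped

print(zapleadzero("00000330303"), zapleadzero("00000000.00"))
```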

Putting it all together - some examples

EXAMPLE ONE

; file is variable text type
filtyp:t
; each record is separated by CR NL 2 letter type
recsep  \r\n
; There are NO fldsep - we will use partials
; There are NO reckey or fldkey - we will test strings for the type of processing
; allow wild cards
wild:*
; 
set	  qc		\004\n
set	  topbit  \n{M2Processing Date :
set	dash	" _ "

;Partial a Class line which contains the Class/Name/Length of race
; eg : Class 2 - ATV Anniversary Hcp. - 1000 M
partial:pclass  p:0::\s s:0 n:0  s:0 t:1 p:0::- t:1 p:0

; localised matchs - search and replace
match:mhcp	/(Hcp.)//
match:mhcp2	/Hcp.//
; replace M with meters
match:mmeters	?M?meters?
;
;******************** output section ***********************8
output:
; Start by clearing flags 99 and 1 for each input record...
r=*	clrflag=99 clrflag=1

; Now test for ONLY those lines which match our needs...
;| all  |if field1 start|partial field1 |if partial	 |set	 |set flag 99 on
;| recs |with Class	  |according using|field 3 is not|flag 1 |too
;|		|			 |pclass	 |empty	 |one	 |
r=*	  ifeq "Class" f1 pclass f1		 ifnnul p3		 x1		x99

; Print out only the names of a new race - only process if flag 1 is ON
; Use flag x101 to output [rf3] for the FIRST race only - which is the 1st class
r=*	  ifflag x1		 ifnflag x101	 [rf3]	x101
; partial f1 again using pclass, if partial field 6 is NOT empty, remove extra
;	spaces, Do the two search and Replaces and output followed by a 004 NL
r=*	  ifflag x1		 pclass f1		 ifnnul p6 zapspcextra mhcp mhcp2 p6 qc
; remove extra spaces from partial 1 and output it, output partial 2 and 3, then
;	if partial field 8 is NOT empty, add (spc) (dash) (spc) etc
r=*	  ifflag x1		 zapspcextra p1 p2 p3 ifnnul p8 dash mmeters zapspc p8
r=*	  ifflag x1		 qc


FipSeq

Many keywords in the DF module can have variables as well as fixed text for parameters.

These are generically called FipSeq strings and can be :

		- Normal Ascii printable text : remember that leading and trailing spaces
		are always trimmed so use double quotes to embed :
		"	Some leading spaces and some trailing	 "
		Also in the record specification ALL spaces between fields are
		stripped; again use double quotes to embed or Unix escape chr \s
		- Unix style escape chrs : backslash then lowercase chr :
		Carriage return	CR  : \r
		New Line	NL  : \n
		Space		SPC : \s
		Backspace	BS  : \b
		Tab		TAB : \t
		Backslash		 : \\
		Form feed or Vertical Tab  FF or VT : \f
		Wild chr (if specified) : \w
		Hexadecimal number  :\x99
		CR NL			 : \l
		- Octal numbers : backslash and 3 digits zero padded : \001, \377
		These can be decimal or hex by using the 'number:' keyword.
		- Internal FIP header fields : backslash and 2 uppercase chrs :\SN, \DQ
		to extract fields from the Source Header ( Fip field SH) use \X?
		ie \XP for Priority.
		- System variables :
		\$D : day of month in 99 format
		\$M : month in xxx format
		\$I : month in 99 format
		\$Y : year in 99 format
		\$H : hour (99)
		\$N : min (99)
		\$B : sec (99)
		\$J : julian date (3digits, Jan1 is 001)
		\$S : 3 digit ascending sequence number
		\$Z : 4 digit ascending sequence number
		\$A : atex orig field (SOURCE;06/06,14:35)
		\$C : number of chrs in file
		\$W : number of words in file (IP_WORD_LEN)
		\$R : Random letter
		\$O : end optional text
		\$X : strip trailing spaces of buffer so far

Fip Header fields can be further manipulated using pseudo-fields :
	fixed: QZ		 1234543
	partial:QT		ST,3,2,U,<,>
	combie:QZ		 ep|na,(000000)a
	option:QT		 ep,11,7,s

For fixed fields : 
	fixed: QZ		 1234543 
	ie If QZ is specified, replace with 1234543 
	Syntax  fixed: [newfield]		 [tab/space]	  [fixed text] 
 
For partial fields. An example : 
	partial:QT		ST,3,2,U,<,> 
	ie If QT, take ST header field posn 3 for 2 chrs, UPPERCASE. 
	Syntax  partial: [newfield]	  [tab/space] 
		[existing field] [comma] [startposn] 
		[opt comma length] 
		[opt comma processing] 
		[opt comma start chr] 
		[opt comma end chr] 
	where : Start and Length start from 1 not 0. 
	 Length can be zero or not defined for all characters in the field 
	 Processing is U-uppercase, L-lower, N allow only numbers, P-printables 
	 The Start Chr can be used to start the string. If there is also a 
	 length then this length is FROM the Start Chr. 
	 The End Chr can be used to end the string when it is of indefinite length.
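The position, length and processing options of a header 'partial' can be pictured with the Python sketch below. The function `header_partial` is hypothetical, the optional start and end characters are left out, and the treatment of N and P as simple character filters is an assumption.

```python
def header_partial(value, start, length=0, processing=""):
    """Extract part of a header field; start counts from 1, not 0."""
    begin = start - 1
    # A length of zero means all remaining characters in the field.
    piece = value[begin:] if length == 0 else value[begin:begin + length]
    if processing == "U":
        piece = piece.upper()
    elif processing == "L":
        piece = piece.lower()
    elif processing == "N":
        piece = "".join(ch for ch in piece if ch.isdigit())
    elif processing == "P":
        piece = "".join(ch for ch in piece if ch.isprintable())
    return piece

# partial:QT  ST,3,2,U  applied to an ST field holding "abcdef"
print(header_partial("abcdef", 3, 2, "U"))  # CD
```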
 
For combinations : 
	combie:QZ		 ep|na,(0000000)a 
	ie Use EP header field, if not there use NA field, if not use the 
		fixed text '(0000000)a'. 
	Syntax  combie: [newfield]		[tab/space] 
		[existing field1] [|] [existing field2] 
		[opt comma] [opt default fixed text] 
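The fallback chain of a combination can be sketched in Python as below. The header is modelled as a plain dictionary and an empty field is treated the same as a missing one; both are assumptions for the example.

```python
def combie(header, first, second, default=""):
    """Return the first field if present, else the second, else the default."""
    return header.get(first) or header.get(second) or default

print(combie({"NA": "Smith"}, "EP", "NA", "(0000000)a"))  # Smith
print(combie({}, "EP", "NA", "(0000000)a"))               # (0000000)a
```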
 
For optional fields (used in conjunction with the \$O flag): 
	option:QT		 ep,11,7,s 
	ie If EP header field exists and has a space in the 7th position, 
		send this text else strip text until the \$O flag. 
	Syntax  option: [newfield]		[tab/space] 
		[newfield] [?] [existing field] [comma] [size] 
		[opt comma] [opt posn of test chr] 
		[opt comma] [opt posn to send remainder of fld] 
	where size is minimum size of field. 
	The send parameter will send contents of the field from that position 
	onwards. If not present, the field is used ONLY as a test and NOT 
	to send chrs.  Note that both size and test start from 1 not 0. 
	A single chr can be tested to be non-space as in the example above. 
	If either the size or the test is FALSE, all text and subsequent data 
	whether fixed or variable (including more Optionals) is ignored until 
	the EndOpt flag is met - '\$O' (see below). 

Watch out using FipSeq strings in 'set's

Note that 'set's are parsed when the parameter file is changed/created so that any Variable data will be of that time and any FIP hdr data meaningless. eg:

	set	timedate	\$d-\$m-\$y \$h:\$n
will produce the date and time of when the parameter file was last changed.

However FipSeq variables specified in record output lines will return run-time data eg :

	r=*	"And now the date and time : \$d-\$m-\$y \$h:\$n"
will produce date and time when that record was processed.

Importance of your LOCALE

Unix allows you to play around with character sets - called Locales - and this can have repercussions for the data formatting module.

These are defined as part of the ENVIRONMENT.

  • look at the man pages for 'setlocale'
  • check your own settings with 'env | more' for LANG, LOCPATH etc.

For any non-English environment, it is important to define :

  • What is an Alphabetic chr ? - normally a-z, A-Z
    • remember all the accented characters
  • What is a control character ? - normally octal 0-037
    • sometimes these can also be octal 200-237.
  • What is punctuation ? - normally ",.!@#$%^&*()-_=+[]{};':"<>
    • if you want to use 'zappunc', make sure.

If you get it wrong, you may find that an accented chr you consider to be alphabetic is processed by 'ipformat' as a binary chr. So take care.

Current Version

Current version limits are

	flags		- 300 allowed in the range 1-300
	counters	- 300 allowed in the range 1-300
	saves		- 300 allowed in the range 1-300
	calculations	- 300 allowed in the range 1-300
	partials	- 100 allowed in the range 1-100
	matches		- 30 allowed for any one field
	record length	- 64k maximum
	fields in a record	- max 100
	there can be up to 1000 'set's and other constants
There is also an internal buffer size for the size of the binary of the parameter file which is 16k - however most binaries are under 1k and the biggest seen so far is about 5k.
	keysize must be less than 20 chrs
	ipformat misses 1st record if the initial sep is missing
	split keys are not allowed
	all keys are case INSENSITIVE

Save areas must be less than 16K each. In versions from 040, the program should handle many changes, additions etc. However if you do use buffers which are TOO big, an error message to that effect is logged in the Item Log and data MAY be ignored.

Until modified, note that a clrsave=* will reset everything.

POSTSCRIPT DRIVER - ipsetter

Please see Ipsetter


PROGRAM DETAILS

This section covers the following programs :
  • form
  • ipformd
  • ipformat
  • ipformbl
  • ipformch
  • ipfsel
  • ipfprep
  • ipfchk

form

Manual interface to the data formatting package

Allowable commands are :

	x - go into test mode
	l - look at log
	m - look at log - for 'form' items
	c - check crontab - for items about to go - c all for ALL of root's crontab
	t - look at the individual Parameter file in tables/text/text and show the contents
	p - look at main form files in tables/form - PROCESS, SETTER, SETPAGE etc
	g - go auto
	h - help
	v - version
	q - quit

-----------------------------------

ipformd

Please see Ipformd

This is the daemon for data formats.

A parameter file is used to route and process incoming files. This parameter file defaults to tables/form/PROCESS.

It first uses a selection table to decide what the job really is. As the list is top down, only the first valid selection is processed.

The 'jobname' found is usually the name of a parameter file in tables/form/text.

IPFORMD will automatically start IPFORMAT with parameters of the input file and the jobname/parameter file.

However, optionally IPFORMD can be used to run a sequence of 'jobs' specified for a particular jobname.

The syntax for the PROCESS file is :

	; comment
	; the following is a selection line
	(hdr field) = string [opt (tab) & (hdr) = string ...] (tab) >jobname (nl)

	job:(jobname) (program to run)
	trace:(jobname)
	test:(jobname)

To describe the Selection syntax in detail :

	(hdr field) = string [opt (tab) & (hdr) = string ...] (tab) >jobname (nl)

  • Each selection is on a single line.
  • If necessary, multiple conditions can be specified with the '&' to 'and' them.
  • The operation equal, '=', can also be NOT equal '!='.
  • Source Header fields (in SH) are preceded by X, ie XC for category.
  • A '*' is used as a wild card string; a '?' is a single wild card chr.
  • To search for a string/chr embedded somewhere in a field, use a '*' before and after.
  • If embedded spaces are needed in the string-to-be-searched, use an '*'.
  • Note that the search string is case-insensitive.
  • Both the selection file and the main file are scanned completely, so that one file may be sent to none, one or several destinations according to the same or different criteria.
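The wild card and '&' rules can be approximated in Python with the standard fnmatch module, which uses the same '*' and '?' conventions. This is a sketch only: the function names and the dictionary model of the header are hypothetical, and fnmatch also gives special meaning to square brackets, which the PROCESS file is not stated to do.

```python
from fnmatch import fnmatchcase

def condition_matches(header, field, pattern, negate=False):
    """Test one '(hdr field) = string' condition, ignoring case.

    negate=True stands for the '!=' form.
    """
    value = header.get(field, "")
    hit = fnmatchcase(value.lower(), pattern.lower())
    return hit != negate

def selection_matches(header, conditions):
    """All conditions joined by '&' must hold."""
    return all(condition_matches(header, *c) for c in conditions)

hdr = {"XC": "Sport", "XK": "Premier Football League"}
print(selection_matches(hdr, [("XC", "S*"), ("XK", "*Football*")]))  # True
```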

For the 'job' parameter :

  • '$i' refers to the input file name (Note \$i is still the FIP System Variable 'month')
  • all queues and files are assumed to be under /fip/spool
  • Never assume however that the path environment has been set up, so we advise you specify full pathnames for the programs.
  • all 'job' lines MUST precede the selection - ie be above.
  • FIP System variables and Header fields can be accessed.
  • there can be one or many or very many job lines.
  • any program can be run
  • if a script/program returns an error, it is logged in the Item log and further processing stops.

If a 'job' exists for a jobname, ipformd will NOT run ipformat but will run what is specified - which may be ipformat of course.

The 'trace' parameter is used for setup, tuning and testing a new job. All it does is tell IPFORMD to log each line in the Item log. EG:

	trace:shares

Trace MUST always be specified in the PROCESS file BEFORE the jobs (ie on a line nearer the top of the file) and all jobs must be before the selection lines.

The 'test' parameter does the same as 'trace' BUT NONE of the programs are actually run. This allows you just to check syntax etc. EG:

	test:racecards

Test MUST always be specified in the PROCESS file BEFORE the jobs (ie on a line nearer the top of the file) and all jobs must be before the selection lines.

First examples - simple jobs :

	; Sports copy to Tranmere Rovers FC.
	XC=S* & XK=*Football*	>tranmere
	EP=TAR.	>tarmac
	RD=*Broken_Hill*	>bhp

where TRANMERE, TARMAC and BHP are all various parameter files in tables/form/text

Second example of a sequence of jobs. Here, if the file name starts 'borsen', IPFORMAT is started twice (sequentially, ie one after the other and NOT at the same time) using two parameter files in tables/form/text : WKENDSHARES and DAILYSHARES .

	; shares : for both weekday and weekend
	job:shares /fip/bin/ipformat -p dailyshares -i $i
	job:shares /fip/bin/ipformat -p wkendshares -i $i

	; Selection for job called 'shares'
	SN=borsen*	> shares

Third example shows how to sort, xchg and basically really screw around.

	; Reformat, sort and generally destroy the horses ...
	job:geges /bin/rm -f formsave/NAGS*
	job:geges /fip/bin/ipformat -p geges -i $i -D -S NAGS
	job:geges /fip/bin/ipxchg -1 formsave/NAGS -D geges -F -o formsave
	job:geges /bin/sort +0 -3 -o formsave/NAGS.done formsave/NAGS
	job:geges /bin/mv formsave/NAGS.done 2go/#SN:\SN#DU:nagsdone

	; Selection for job called 'geges'
	SU=RACEWIRE & SN=horse*	> geges

Fourth Example is where the 'test' parameter is added to the 'geges' job in the Third example (Remember the 'test' must be specified on a line at the top of the PROCESS file BEFORE any 'jobs' for the same jobname) :

	test:geges

Going into the MUI, ip, and doing a 'l' to list the log (or 'm' to more) gives:

	Sat Mar 4 11:52:30 ipformd i : Incoming File : geges : : geges
	Sat Mar 4 11:52:30 ipformd f : Test/NotRun : /fip/bin/ipformat -p geges -i /fip/spool/form/geges -D -S NAGS
	Sat Mar 4 11:52:30 ipformd f : Test/NotRun : /fip/bin/ipxchg -1 formsave/NAGS -D geges -F -o formsave
	Sat Mar 4 11:52:30 ipformd f : Test/NotRun : /bin/sort +0 -3 -o formsave/NAGS.done formsave/NAGS
	Sat Mar 4 11:52:30 ipformd f : Test/NotRun : /bin/mv formsave/NAGS.done 2go/#SN:geges#DU:nagsdone

Other points worth noting (ish) ..

Break out - If either the input parameter -x or a header field FZBO is present, the input file is 'broken apart' into blocks, records and fields. The resultant file is called (dest)_(SN) in spool/formtest, where dest is as above and SN is the filename.

If an incoming file matches none of the tests, it is deleted and an error logged.

In the selection file, remember to specify long (more specific) names first. In the following example, job 'sunrac2' never gets processed, as every file that matches RACING* also matches RAC* and so will be jobbed as 'sunrac' :

    XK:RAC* >sunrac
    XK:RACING* >sunrac2
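The ordering problem can be shown in a few lines of Python. This is only a sketch, assuming that the first matching line wins and that '*' behaves like a shell wildcard, as the example above suggests. first_match and the two rule lists are made-up names, not part of FIP.

```python
from fnmatch import fnmatchcase


def first_match(value, rules):
    """Return the job of the first rule whose pattern matches, else None."""
    for pattern, job in rules:
        if fnmatchcase(value, pattern):
            return job
    return None


# Order as in the example above: the short pattern shadows the long one.
wrong_order = [("RAC*", "sunrac"), ("RACING*", "sunrac2")]
# Long (more specific) name first, so both jobs can be reached.
right_order = [("RACING*", "sunrac2"), ("RAC*", "sunrac")]

print(first_match("RACING-EXTRA", wrong_order))
print(first_match("RACING-EXTRA", right_order))
print(first_match("RACECARD", right_order))
```

With the short pattern first, nothing can ever reach 'sunrac2'. Swapping the two lines fixes it without changing either pattern.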

Input parameters are (all optional) :

    -c : name of a queue into which copies of all incoming files are made.
         default: no copies
    -f : file creep time
         default: 0
    -i : queue to scan
         default: spool/format
    -l : do NOT log every incoming file/destination
         default: log
    -n : run the program at reduced priority
         default: nice 5
    -p : processing file to use
         default: tables/form/PROCESS
    -s : run files serially (ie one after the other)
         default: parallel
    -t : scan time of directory
         default: 3 secs
    -T : Always trace jobs. This is the same as the 'trace' parameter used for setup, tuning and testing a new job. All it does is tell IPFORMD to log each line in the Item log.
         default: no
    -x : debugging ON - ALL incoming files will be 'broken out' in formtest. Parameter is 'o'ctal, 'd'ecimal or 'h'ex.
         default: off
    -z : calm down time, to attempt to let ipformat finish one job before the next
         default: 5 secs
    -v : display version number and exit.

ipformat

Ipformat is the key formatting program. It can read and split text, csv and xml files into records and fields and reassemble them using conditions and calculations.

Please see Ipformat

ipformbl

Please see Ipformbl

IPFORMBL takes the text parameter file and builds a binary so that IPFORMAT can run faster.

ipformch

This is the checker for data formats. It is usually started by crontab.

Please see Ipformch

ipfsel

This program is used to select and sort lines from an incoming data file using a selection file.

A parameter file is used to determine where the files should be sent and other parameters. This file is in tables/form/select and defaults to SELECTION. This may be overridden by the contents of the Fip Hdr field 'DF'.

Please see Ipfsel

ipfprep

Please see Ipfprep for up-to-date information.

This program prepares (tweaks) incoming data before IPFORMAT is run against it.

ipfchk

Please see Ipfchk for up-to-date information.

This program scans a directory and checks the incoming files for CRC errors.

If any are found, the offending lines are stuffed into an error file and sent to an errdst, while the resultant file is flagged with the HE field set to ERROR.



© FingerPost Ltd. 1996 and beyond

Examples

Topic revision: r1 - 18 Jul 2008 - 14:53:10 - TWikiGuest
 