This is a library for working with data matrices, taking off from where library(csv) ends.
The library will hopefully grow to become a useful tool for logic programming based data science.
In theory the library supports polymorphic representations of matrices, but in its
current form it is best to assume that the canonical form (see mtx/1) is the only one supported.
The library should be considered as still in developmental flux.
License: MIT.
At the very least, library(mtx) can be viewed as an enhancement to the I/O of matrices to and from files via mtx/2.
The library can interrogate the data/ subdirectory of all installed packs for csv files using the alias data.
?- mtx( data(mtcars), Mtcars ).
Mtcars = [row(mpg, cyl, disp, hp, ....
Where mtcars.csv is in some pack's data directory.
?- mtx_data( mtcars, Mtcars ).
Mtcars = [row(mpg, cyl, disp, hp, ....
Where mtcars.csv is in the data subdirectory of pack(mtx).
mtx/2 works both as input and output.
If the 2nd argument is ground, mtx/2 will output the 2nd argument to the file pointed to by the 1st.
Else, the 1st argument is read into the 2nd argument in standard form.
?- tmp_file( mtc, TmpF ), mtx( pack('mtx/data/mtcars'), Mtc ), mtx( TmpF, Mtc ).
TmpF = '/tmp/pl_mtc_14092_0',
Mtc = [row(mpg, cyl,
The first call to mtx/2 above reads the test csv mtcars.csv into Mtc (instantiated to a list of rows).
The second call outputs Mtc to the temporary file TmpF.
mtx/3 provides a couple of options on top of csv_read_file/3 and csv_write_file/3.
sep(Sep)
is short for separator, and also understands comma, tab and space (see mtx_sep/2).
match(Match)
is short for match_arity(Match)
?- mtx( data(mtcars), Mtcars, sep(comma) ).
Mtcars = [row(mpg, cyl, disp, hp, ....)|...]
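The separator shorthand can be pictured with a small, self-contained sketch. The names sketch_sep/2 and sketch_sep_name/2 are hypothetical and not part of the pack; the sketch only assumes that comma, tab and space stand for their character codes, whereas the library's own mapping lives in mtx_sep/2 and may accept further names.

```prolog
% sketch_sep(+Sep, -Code): hypothetical stand-in for a separator lookup.
% Assumption: integers are already character codes and pass through.
sketch_sep(Sep, Code) :-
    (   integer(Sep)
    ->  Code = Sep
    ;   sketch_sep_name(Sep, Code)
    ).

% Assumed name-to-code table for the three names mentioned above.
sketch_sep_name(comma, 44).     % code of ','
sketch_sep_name(tab,    9).     % code of the tab character
sketch_sep_name(space, 32).     % code of ' '
```

With this sketch, sketch_sep(comma, Code) binds Code to 44, and an unknown name simply fails.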
If a predicate definition has both Cnm and Cps define them in that order.
Good starting points are the documentation for mtx/1, mtx/2 and mtx/3.
pack(prolog/mtx.pl)
True iff Mtx is a valid representation of a matrix.
This is a synonym for mtx( Mtx, Canonical ).
The predicate is mostly present for documentation purposes.
The canonical representation is a list of terms.
Valid representations are (see mtx_type/2).
a file (or its stem) that can be read by csv_read_file/2; alias paths are accepted and the normal delimited file extensions can be omitted
+ Notes for developers.
For examples use:
?- mtx_data( mtcars, Mtcars ).
Mtcars = [row(mpg, cyl, disp, hp, ....
?- mtx( pack(mtx/data/mtcars), Mtc ).
?- mtx( data(mtcars), Mtx ).
Variable naming conventions
If a predicate definition has both Cnm and Cps define them in that order.
?- mtx_data( mtcars, Cars ), mtx( Cars ).
The canonical representation of a matrix is a list of compounds, the first of which is the header and the rest of which are the rows. The term name of the compounds is not enforced: by convention the header is named either hdr or row, and rows are usually named row.
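To make the canonical form concrete, here is a minimal sketch in plain Prolog that does not depend on the pack. Both sketch_canonical/1 and sketch_column/3 are hypothetical helpers written for illustration; they assume only what is stated above (a list of compounds with the header first) and are not the library's implementation.

```prolog
:- use_module(library(lists)).

% sketch_canonical(+Mtx): true if Mtx is a list of compound terms.
% No restriction is placed on the term names (hdr, row, r, ...).
sketch_canonical(Mtx) :-
    is_list(Mtx),
    forall(member(Row, Mtx), compound(Row)).

% sketch_column(+Mtx, +Cnm, -Values): Values are the entries found under
% the first header argument that unifies with Cnm (header excluded).
sketch_column([Hdr|Rows], Cnm, Values) :-
    arg(Pos, Hdr, Cnm), !,
    findall(V, (member(Row, Rows), arg(Pos, Row, V)), Values).
```

For instance, sketch_column([hdr(a,b,c),row(1,2,3),row(4,5,6)], b, Vs) binds Vs to [2,5].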
When Opts is missing it is set to the empty list (see options/2).
Modes
When +Any is ground and -Canonical is unbound, Any is converted from any of the accepted input formats (see mtx_type/2).
When both +Canonical and +Res are ground, Res is taken to be a file to write on.
Under +Canonical and -Res, Res is bound to Canonical (allows non-output).
This predicate is often called from within mtx pack predicates to translate inputs/outputs to canonical matrices before performing the intended operations.
The predicate can be made to look at data directories of packs for input data matrices.
The following three calls are equivalent.
?- mtx( data(mtcars), Mtcars, sep(comma) ).
?- mtx( data(mtcars), Mtcars ).
?- mtx( pack('mtx/data/mtcars.csv'), Mtcars).
Data matrices can be debugged via the dims and length goals in debug_call/3.
?- debug(mtx_ex).
?- use_module(library(lib)).
?- lib(debug_call).
?- mtx( data(mtcars), Mtcars ), debug_call( mtx_ex, dims, mtcars/Mtcars ).
% Dimensions for matrix, (mtcars) nR: 33, nC: 11.
Mtcars = [row(mpg, cyl, disp, hp, ....)|...]
?- mtx( data(mtcars), Mtcars ), debug_call( mtx_ex, len, mtcars/Mtcars ).
?- mtx( data(mtcars), Mtcars ), debug_call( mtx_ex, length, mtcars/Mtcars ).
% Length for list, mtcars: 33
Mtcars = [row(mpg, cyl, disp, hp, ....)|...]
Opts is a term or list of terms from the following:
mtx(Handle,Mtx)
match(Match)
is translated to match_arity(Match) into Wopts and Ropts
sep(Sep)
is translated to separator(SepCode) into Wopts and Ropts, via mtx_sep(Sep,SepCode)
, mtx_sep/2
?- mtx( pack(mtx/data/mtcars), Cars ), length( Cars, Length ).
Cars = [row(mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb), row(21.0, ....],
Length = 33.
?- mtx( What, [hdr(a,b,c),row(1,2,3),row(4,5,6),row(7,8,9)], [output_file(testo)] ).
What = testo.
?- shell( 'more testo' ).
a,b,c
1,2,3
4,5,6
7,8,9
true.
?- mtx( What, [hdr(a,b,c),row(1,2,3),row(4,5,6),row(7,8,9)], [input_file('testo.csv'),output_postfix('_demo')] ).
What = testo_demo.csv.
?- mtx( pack(mtx/data/mtcars), Cars, cache(cars) ).
Cars = [row(mpg, cyl...)|...]
?- debug(mtx(mtx)).
?- mtx( cars, Cars ).
Using cached mtx with handle: cars
Cars = [row(mpg, cyl...)|...]
?- mtx( pack(mtx/data/mtcars), Mtx, cache(mtcars) ), assert(mc(Mtx)), length( Mtx, Len ).
...
Len = 33.
?- mtx( mtcars, Mtcars ), length( Mtcars, Len ).
...
Len = 33.
?- mtx( mc, Mc), length( Mc, Len ).
...
% Len = 33.
@version 1:0, 2014/9/22
@version 1:1, 2016/11/10, added call to mtx_type/2 and predicated matrices
@tbd options version, with 1. read_options(ReadCsvOpts) and fill_header(true) -> with new_header(HeaderArgsList) (fill_header(replace) -> replaces header new_header(...)) new_header(1..n) by default.
?- mtx_data( mtcars, Mt ), mtx_column_kv( Mt, mpg, KVs ). KVs = [21.0-row(21.0, 6.0, 160.0, 110.0, 3.9, 2.62, 16.46, 0.0, 1.0, 4.0, 4.0), 21.0-row(21.0, 6.0, 160.0, 110.0, 3.9, 2.875, 17.02, 0.0, 1.0, 4.0, 4.0), 22.8-row(22.8, 4.0, 108.0, 93.0, 3.85, 2.32, 18.61, 1.0, 1.0, 4.0, 1.0), 21.4-row(21.4, 6...)|...]
has_header(HasH)
in Options.
If HasH is false, Header is a made up row of the shape row(1,...,N)
Opts
The predicate is meant as a companion to mtx_header_body/5.
Here, unlike in the alternative implementation, we first look for Cid in the arguments of Hdr; if that succeeds, the corresponding position is returned. Only then do we check whether Cid is an integer before returning it as the requested position. In this case we also check that Pos is within range.
?- mtx_mtcars( Mt ), Mt = [Hdr|_Rows], mtx_header_column_name_pos( Hdr, mpg, Cnm, Cpos ).
Cnm = mpg, Cpos = 1.
?- mtx_mtcars( Mt ), Mt = [Hdr|_Rows], mtx_header_column_name_pos( Hdr, 3, Cnm, Cpos ).
Cnm = disp, Cpos = 3.
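The lookup order described above (column name first, integer position second, with a range check) can be sketched in a few lines of plain Prolog. The predicate sketch_header_column_name_pos/4 is a hypothetical illustration under those stated assumptions, not the pack's own code.

```prolog
% sketch_header_column_name_pos(+Hdr, +Cid, -Cnm, -Pos)
% 1. Try Cid as a column name among the arguments of Hdr.
% 2. Otherwise accept Cid as a position, if it is an integer in 1..Arity.
sketch_header_column_name_pos(Hdr, Cid, Cnm, Pos) :-
    (   arg(Pos0, Hdr, Cid)
    ->  Pos = Pos0,
        Cnm = Cid
    ;   integer(Cid),
        functor(Hdr, _, Arity),
        Cid >= 1,
        Cid =< Arity,
        Pos = Cid,
        arg(Pos, Hdr, Cnm)
    ).
```

One consequence of trying names first is that a header such as hdr(3,2,1) resolves Cid = 3 to position 1, not position 3.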
mtx_header_column_name_pos( Hdr, Cid, _, Pos ).
?- mtx_mtcars(M), mtx_header(M,H), mtx:mtx_header_column_pos(H,carb,Pos).
Version 0.2 added File.
?- mtx_sort( [row(a,b,c),row(1,2,3),row(7,8,9),row(4,5,6)], b, Ord ). Ord = [row(a, b, c), row(1, 2, 3), row(4, 5, 6), row(7, 8, 9)].
basename(CsvF)
.pl exists and no option pl_ignore(true)
is given, then the .pl file
is consulted into Module with no further questions asked of Opts.
A warning message is printed on user_output except if pl_warning(false)
is in Opts.
Opts should be one, or a list, of the following:
header(true)
*true csv file has a header and this is asserted
*false csv file has no header and hdr(1,...,n) is asserted
*void csv file has no header and none is asserted
*ignore csv file has a header but this is ignored (nothing asserted)
pl_ignore(false)
If true, the predicate does not check for the existence of the corresponding .pl file.
pl_warning(true)
If false, no warning is printed when the pre-canned .pl file is loaded "as-is".
pl_record(false)
false or true. If true, record the loaded program to the corresponding .pl file.
Any remaining options are passed to csv_read_file/3.
Values should be a list of values, or a term of the form:
call(WholeG,AllClmData), where AllClmData is the whole Kth Column (minus header).
Note that for callable K, all columns of Mtx that succeed on the K(Cid) are transformed.
N is taken to be relative to each input and can be an expression, except if it is of the form abs_pos(Abs) (see mtx_relative_pos/5).
?- Mtx = [row(a, b, d), row(1, 2, 4), row(5, 6, 8)], assert( mtx(Mtx) ).
?- mtx(Mtx), mtx_column_add( Mtx, 3, [c,3,7], New ).
New = [row(a, b, c, d), row(1, 2, 3, 4), row(5, 6, 7, 8)].
?- mtx(Mtx), mtx_column_add( Mtx, -1, [c,3,7], New ).
New = [row(a, b, c, d), row(1, 2, 3, 4), row(5, 6, 7, 8)].
?- mtx(Mtx), mtx_column_add( Mtx, d, [c,3,7], New ).
New = [row(a, b, c, d), row(1, 2, 3, 4), row(5, 6, 7, 8)].
?- mtx(Mtx), mtx_column_add( Mtx, 3, transform(3,plus(1),plus1), New ).
New = [row(a, b, d, plus1), row(1, 2, 4, 5), row(5, 6, 8, 9)].
?- Mtx = [hdr(a,b,a,c), row(1,2,1,3), row(2,3,2,4)], mtx_column_add( Mtx, +(1), transform(=(a),plus(2),plus2), Out ).
Out = [hdr(a, plus2, b, a, plus2, c), row(1, 3, 2, 1, 3, 3), row(2, 4, 3, 2, 4, 4)].
?- Mtx = [hdr(a,b,a,c), row(1,2,1,3), row(2,3,2,4)], mtx_column_add( Mtx, 1, transform(=(a),plus(2),atom_concat('2+')), Out ).
Out = [hdr(a, '2+a', b, a, '2+a', c), row(1, 3, 2, 1, 3, 3), row(2, 4, 3, 2, 4, 4)].
?- Mtx = [hdr(a, b, c), row(1, 2, 3), row(4,5,6)], mtx_column_add( Mtx, 4, transform([1,2],sum_list,atom_concat('a+b')), Out ).
Out = [hdr(a, b, c, ab), row(1, 2, 3, 3), row(4, 5, 6, 9)].
?- ['/home/nicos/pl/lib/src/meta/aggregate'].
?- Mtx = [r(a,b,c,d),r(x,1,2,3),r(y,4,5,6),r(z,7,8,9)], mtx_column_add( Mtx, 5, derive(aggregate(plus(),0,indices([3,2,4])),1,3,sum), Otx ).
Otx = [r(a, b, c, d, sum), r(x, 1, 2, 3, 6), r(y, 4, 5, 6, 15), r(z, 7, 8, 9, 24)].
When Cid is unbound, all possible values are enumerated, with Cid = Cname.
?- mtx_mtcars(Mtc), mtx_column( Mtc, carb, Carbs ). Carbs = [4.0, 4.0, 1.0, 1.0, 2.0, 1.0, 4.0, 2.0, 2.0|...].
Since v.0.2 supports memory csvs.
Since v.0.3 supports Order. Previously Order = true was assumed, which remains the default for backward compatibility.
% fixme: use the cars csv from pac()
?- mtx_read_file( 'example.csv', Ex )
, mtx_columns( Ex, [c,b], ABs )
.
Ex = [row(a, b, c)
, row(1, 2, 3)
, row(4, 5, 6)
, row(7, 8, 9)
],
ABs = [row(2, 3)
, row(5, 6)
, row(8, 9)
].
% fixme: use the cars csv from pac()
?- mtx_read_file( 'example.csv', Ex )
, mtx_columns( Ex, [c,b], false, ABs )
.
Ex = [row(a, b, c)
, row(1, 2, 3)
, row(4, 5, 6)
, row(7, 8, 9)
],
ABs = [row(3, 2)
, row(6, 5)
, row(9, 8)
].
?- mtx_data( mtcars, Mt ), mtx_column_default( Mt, mpg, true, Mpg ).
Mt =...,
Mpg = [21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8|...].
?- mtx_data( mtcars, Mt ), mtx_column( Mt, typo, NaL ).
ERROR: Unhandled exception: could_not_locate_column_in_header_row(typo,row(mpg,cyl,disp,hp,drat,wt,qsec,vs,am,gear,carb))
?- G = ( Mpg=[] ), mtx_data( mtcars, Mt ), mtx_column_default( Mt, typo, G, Mpg ).
G = ([]=[]),
Mpg = [],
Mt = ... .
cnm_StdCnm(Cnm)
is in Opts.
Def is propagated as the 3rd argument to mtx_column_default/3,
except when it is an atomic different to true and false.
In the latter case, a ball is prepared which includes Def in its arguments,
with the intuition that in that case Def is an atom identifying the
matrix or its source to the user.
?- Mtx = [r(a,sec,c),r(1,2,3),r(4,5,6)], assert( m(Mtx) ).
?- m(Mtx), mtx_column_name_options( Mtx, b, example, Column, [] ).
ERROR: Unhandled exception: matrix_required_column_missing(example,b)
?- m(Mtx), mtx_column_name_options( Mtx, b, false, Column, [] ).
false.
?- m(Mtx), mtx_column_name_options( Mtx, b, example, Column, [cnm_b(sec)] ).
Mtx = [r(a, sec, c), r(1, 2, 3), r(4, 5, 6)],
Column = [2, 5].
Opts
cnm_from(From=from) | from |
cnm_to(To=to) | to |
cnm_weight(Weight=weight) | weight |
cnm_StdCnm(Cnm)
is in Opts and Cnm is ground.
Mtx and Out can be either files or read-in rows: see mtx/3. Opts are passed to the two calls.
Opts
?- assert( mtx1([row(a,b,c),row(1,2,3),row(4,5,6)]) ).
?- mtx1( Mtx1 ), mtx_column_include_rows( Mtx1, 2, =:=(2), Rows ).
Rows = [row(a, b, c), row(1, 2, 3)].
?- mtx1( Mtx1 ), mtx_column_include_rows( Mtx1, 2, =:=(4), Rows ).
Mtx1 = [row(a, b, c), row(1, 2, 3), row(4, 5, 6)],
Rows = [row(a, b, c)].
?- mtx1( Mtx1 ), mtx_column_include_rows( Mtx1, 2, =:=(2), Rows, excludes(Exc) ).
Mtx1 = [row(a, b, c), row(1, 2, 3), row(4, 5, 6)],
Rows = [row(a, b, c), row(1, 2, 3)],
Exc = [row(4, 5, 6)].
header(s)
or number(s)
of) from Mtx to produce Sel with remainder Rem.
Sel is the removed column(s), and Rem is the remainder of Mtx.
Rem is a matrix, whereas Sel is a list of values if ColumnS was atomic, or a list of lists of values if
ColumnS was a list.
When CallStr is of the form @(Goal) or call(Goal), it will be applied to each column, with
succeeding columns selected for Sel.
(Note that dealing with presence/absence of column name is delegated to Goal).
Goal is called in user if it is not module prepended (see mod_goal/4).
?- Mtx = [row(a,b,c,d),row(1,1,1,1),row(1,1,2,3),row(2,2,2,2)], assert( ex_mtx(Mtx) ).
?- ex_mtx(Mtx), mtx_column_select( Mtx, b, Red, Sel ).
Mtx = [row(a,b,c,d),row(1,1,1,1),row(1,1,2,3),row(2,2,2,2)],
?- mtx_column_select( Mtx, [a,b], Red, Sel ).
Red = [row(c, d), row(1, 1), row(2, 3), row(2, 2)],
Sel = [[a, b], [1, 1], [1, 1], [2, 2]].
?- assert( ( has_at_least(Tms,Val,List) :- findall( 1, member(Val,List), Ones ), sum_list(Ones,Sum), Tms =< Sum) ).
?- has_at_least(2,a,[a,b,c,a] ).
true.
?- has_at_least(2,b,[a,b,c,a] ).
false.
?- ex_mtx(Mtx), mtx_column_select( Mtx, call(has_at_least(2,1)), Red, Sel ).
Mtx = [row(a, b, c, d), row(1, 1, 1, 1), row(1, 1, 2, 3), row(2, 2, 2, 2)],
Red = [row(c, d), row(1, 1), row(2, 3), row(2, 2)],
Sel = [[a, b], [1, 1], [1, 1], [2, 2]].
The predicate assumes Csv is of the form [Hdr|Rows] and includes Hdr in the result. If you want to call it on header-less Rows, then with a numeric NumClm you can call:
?- mtx_column_threshold( [_|Rows], NumClm, Val, Dir, [_|OutRows] ).
Examples
?- assert( csv([row(a,b,c),row(1,2,3),row(1,4,5),row(3,6,7),row('',8,9),row(3,b,10)]) ).
?- csv( Csv ), mtx_column_threshold( Csv, a, 2, <, Out ).
Out = [row(a, b, c), row(1, 2, 3), row(1, 4, 5)].
?- csv( Csv ), mtx_column_threshold( Csv, 1, 2, >, Out ).
Out = [row(a, b, c), row(3, 6, 7), row(3, b, 10)].
Header is assumed.
Op should be a recognisable operator (see op_compare/3 in stoics_lib).
The predicate will call op_compare( Op, Freq, Thresh ) for the frequency of every distinct value on column Cid in Mtx.
?- assert( a_mtx([r(a,b,c),r(1,2,1),r(1,2,1),r(1,6,7),r(8,9,10)]) ).
?- a_mtx(Mtx), mtx_column_frequency_threshold( Mtx, a, >, 2, Red ).
Red = [r(a, b, c), r(1, 2, 1), r(1, 2, 1), r(1, 6, 7)].
?- a_mtx(Mtx), mtx_column_frequency_threshold( Mtx, a, <, 2, Red ).
Red = [r(a, b, c), r(8, 9, 10)].
?- a_mtx(Mtx), mtx_column_frequency_threshold( Mtx, a, <, 1, Red ).
Red = [r(a, b, c)].
?- a_mtx(Mtx), mtx_column_frequency_threshold( Mtx, a, =<, 1, Red ).
Red = [r(a, b, c), r(8, 9, 10)].
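The frequency filter can be approximated with a short, self-contained sketch. The predicates sketch_frequency_threshold/5 and sketch_keep_row/5 are hypothetical; the sketch assumes the column is given by position and that Op is a standard arithmetic comparison operator called directly, in place of op_compare/3 from stoics_lib.

```prolog
:- use_module(library(apply)).
:- use_module(library(lists)).
:- use_module(library(aggregate)).

% sketch_frequency_threshold(+Mtx, +Pos, +Op, +Thresh, -Red)
% Keep the header, and each row whose value at column Pos occurs
% Freq times in that column such that `Freq Op Thresh` holds.
sketch_frequency_threshold([Hdr|Rows], Pos, Op, Thresh, [Hdr|Kept]) :-
    findall(V, (member(R, Rows), arg(Pos, R, V)), Vals),
    include(sketch_keep_row(Vals, Pos, Op, Thresh), Rows, Kept).

% Count occurrences of the row's value and test it against the threshold.
sketch_keep_row(Vals, Pos, Op, Thresh, Row) :-
    arg(Pos, Row, V),
    aggregate_all(count, member(V, Vals), Freq),
    Goal =.. [Op, Freq, Thresh],
    call(Goal).
```

On the example matrix above, position 1 with > and threshold 2 keeps the three rows whose first value is 1, since that value occurs three times.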
NewClmName(ClmName,New)
where New is used as the new column name.
?- assert( (plus_one(A,B):-B is A + 1) ). % plus/3 only works on integers...
?- mtx( pack('mtx/data/mtcars'), Mtx, cache(mtcars) ), mtx_column_replace( Mtx, mpg, mpgp1, @(user:plus_one()), _, New ).
Mtx = [row(mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb), row(21.0, 6.0, 160.0, 110.0, 3.9, 2.62, 16.46, 0.0, 1.0, 4.0, 4.0), row(21.0, 6.0, 160.0, 110.0, 3.9, 2.875, 17.02, 0.0, 1.0, 4.0, 4.0), row(22.8, 4.0, 108.0, 93.0, 3.85, 2.32, 18.61, 1.0, 1.0, 4.0, 1.0), row(21.4, 6.0, 258.0, 110.0, 3.08, 3.215, 19.44, 1.0, 0.0, 3.0, 1.0), row(18.7, 8.0, 360.0, 175.0, 3.15, 3.44, 17.02, 0.0, 0.0, 3.0, 2.0), row(18.1, 6.0, 225.0, 105.0, 2.76, 3.46, 20.22, 1.0, 0.0, 3.0, 1.0), row(14.3, 8.0, 360.0, 245.0, 3.21, 3.57, 15.84, 0.0, 0.0, 3.0, 4.0), row(..., ..., ..., ..., ..., ..., ..., ..., ..., ..., ...)|...],
New = [row(mpgp1, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb), row(22.0, 6.0, 160.0, 110.0, 3.9, 2.62, 16.46, 0.0, 1.0, 4.0, 4.0), row(22.0, 6.0, 160.0, 110.0, 3.9, 2.875, 17.02, 0.0, 1.0, 4.0, 4.0), row(23.8, 4.0, 108.0, 93.0, 3.85, 2.32, 18.61, 1.0, 1.0, 4.0, 1.0), row(22.4, 6.0, 258.0, 110.0, 3.08, 3.215, 19.44, 1.0, 0.0, 3.0, 1.0), row(19.7, 8.0, 360.0, 175.0, 3.15, 3.44, 17.02, 0.0, 0.0, 3.0, 2.0), row(19.1, 6.0, 225.0, 105.0, 2.76, 3.46, 20.22, 1.0, 0.0, 3.0, 1.0), row(15.3, 8.0, 360.0, 245.0, 3.21, 3.57, 15.84, 0.0, 0.0, 3.0, 4.0), row(..., ..., ..., ..., ..., ..., ..., ..., ..., ..., ...)|...].
?- assert( (psfx_one(Name,Psfxed) :- atomic_list_concat([Name,one],'_',Psfxed)) ).
?- mtx_column_replace( mtcars, mpg, user:psfx_one(), @(user:plus_one()), _, New ).
New = [row(mpg_one, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb), row(22.0, 6.0, 160.0, 110.0, 3.9, 2.62, 16.46, 0.0, 1.0, 4.0, 4.0), row(..., ..., ..., ..., ..., ..., ..., ..., ..., ..., ...)|...].
?- mtx_column_replace( mtcars, mpg, mpgp1, @(plus_one()), _, New ).
New = [row(mpgp1, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb), row(22.0, 6.0, 160.0, 110.0, 3.9, 2.62, 16.46, 0.0, 1.0, 4.0, 4.0), row(..., ..., ..., ..., ..., ..., ..., ..., ..., ..., ...)|...].
Hdr is protected and added to both Sel and Rej.
Opts
?- Csv = [row(a,b,c),row(1,2,3),row(4,5,6)], csv_column_values_select( Csv, c, 3, Red, _ ).
Csv = [row(a, b, c), row(1, 2, 3), row(4, 5, 6)],
Red = [row(a, b, c), row(1, 2, 3)].
% throws error
?- Mtx = [hdr(aa,ab,ba,bb),row(1,2,3)], mtx_name_prefix_column( Mtx, a, Pos, Cnm, Clm ).
?- Mtx = [hdr(aa,ab,ba,bb),row(1,2,3)], mtx_name_prefix_column( Mtx, aa, Pos, Cnm, Clm ).
Pos = 1, Cnm = aa, Clm = [1].
?- mtx_relative_pos( 2, 2, _, Pos ).
Pos = 4.
?- mtx_relative_pos( -2, 0, c(a,b,c), Pos ).
Pos = 2.
?- mtx_relative_pos( -2, 0, c(a,b,c), Nadj, Pos ).
Pos = 2.
Opts
by(By=column)
use row to get the report row-wise
frequency(Freq=false)
to report factors, or add number each factor appeared
max(Max=0)
if positive, the maximum number of items to be displayed for each vector; if negative, no reporting takes place.
?- mtx( pack(mtx/data/mtcars), Cars ), mtx_factors( Cars, _, [max(5)] ), fail.
mpg: [10.4,13.3,14.3,14.7,15.0,...]
cyl: [4.0,6.0,8.0]
disp: [71.1,75.7,78.7,79.0,95.1,...]
hp: [52.0,62.0,65.0,66.0,91.0,...]
drat: [2.76,2.93,3.0,3.07,3.08,...]
wt: [1.513,1.615,1.835,1.935,2.14,...]
qsec: [14.5,14.6,15.41,15.5,15.84,...]
vs: [0.0,1.0]
am: [0.0,1.0]
gear: [3.0,4.0,5.0]
carb: [1.0,2.0,3.0,4.0,6.0,...]
false.
?- mtx( pack(mtx/data/mtcars), Cars ), mtx_factors( Cars, _, [max(3),frequency(true)] ), fail.
mpg: [21.0-2,22.8-2,21.4-2,...]
cyl: [6.0-7,4.0-11,8.0-14]
disp: [160.0-2,108.0-1,258.0-1,...]
hp: [110.0-3,93.0-1,175.0-3,...]
drat: [3.9-2,3.85-1,3.08-2,...]
wt: [2.62-1,2.875-1,2.32-1,...]
qsec: [16.46-1,17.02-2,18.61-1,...]
vs: [0.0-18,1.0-14]
am: [1.0-13,0.0-19]
gear: [4.0-12,3.0-15,5.0-5]
carb: [4.0-10,1.0-7,2.0-10,...]
false.
For each column(CidIn,PosOut) term in Opts, the column with Cid CidIn is copied from Mtx to MtxOut.
In MtxOut, the column is placed in position PosOut.
The predicate scans Opts as they come, so PosOut should
take account of all operations to its left.
?- M1 = [r(a,b,c),r(1,2,3),r(4,5,6)], M2 = [r(d,e,f),r(7,8,9),r(10,11,12)], mtx_columns_copy( M1, M2, M3, column_copy(c,2) ).
M3 = [r(d, c, e, f), r(7, 3, 8, 9), r(10, 6, 11, 12)].
?- mtx_data( mtcars, Mt ), mtx_columns_kv( Mt, mpg, hp, KVs, _, _ ).
Mt = [row(mpg, cyl, disp,..)|...],
KVs = [21.0-110.0, 21.0-110.0, 22.8-93.0, 21.4-110.0, 18.7-175.0, 18.1-105.0, ... - ...|...].
?- mtx_data( mtcars, Mt ), mtx_header( Mt, Hdr ), mtx_header_cids_order( Hdr, [drat,cyl], Order ).
Mt = ...,
Hdr = row(mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb),
Order = [2, 5].
?- mtx_data( mtcars, Mt ), mtx_dims( Mt, Nr, Nc ).
Mt = ...,
Nr = 33,
Nc = 11.
?- mtx_dims( Mtx, 2, 3 ).
Mtx = [row(0, 0, 0), row(0, 0, 0)].
Prolog can be given, in which case it is considered to be a full filename.
If Prolog is free, it is instantiated to the name of the file the facts
were dumped on, or to the Rows themselves if consult(consult) was in Opts.
In what follows, Stem is the first of:
Opts
maplist(Pred)
if you want to use maplist on each row for Pred rather than the default of calling
Pred with RowsIn and RowsOut
Goal is elliptically expanded to an expression.
?- assert( mtx1([row(a,b,c),row(1,2,3),row(4,5,6),row(7,8,9)] ) ).
?- lib(lists). % this is needed for sum_list/2
?- mtx1( Mtx1 ), mtx_columns_partition( Mtx1, sum_list > 0, Mtx2, Excl ).
Mtx1 = Mtx2, Mtx2 = [row(a, b, c), row(1, 2, 3), row(4, 5, 6), row(7, 8, 9)],
Excl = [].
?- mtx1( Mtx1 ), mtx_columns_partition( Mtx1, sum_list > 12, body, Acc, Rej ).
Mtx1 = [row(a, b, c), row(1, 2, 3), row(4, 5, 6), row(7, 8, 9)],
Acc = [row(b, c), row(2, 3), row(5, 6), row(8, 9)],
Rej = [row(a), row(1), row(4), row(7)].
?- mtx1( Mtx1 ), mtx_columns_partition( Mtx1, sum_list > 15, body, Acc, Rej ).
Mtx1 = [row(a, b, c), row(1, 2, 3), row(4, 5, 6), row(7, 8, 9)],
Acc = [row(c), row(3), row(6), row(9)],
Rej = [row(a, b), row(1, 2), row(4, 5), row(7, 8)].
?- assert( (chkmember(List,Elem):-memberchk(Elem,List)) ).
?- mtx1( Mtx1 ), mtx_columns_partition( Mtx1, chkmember([a,c]), head, Acc, Rej ).
If Mtx, Incl and Excl are ground and non-lists, they are taken to be files to read from or write to,
in which case an optimised version is used that does not read the whole file
into memory but processes each line as it is read. In this case Incl and Excl
can be the special atom false, which indicates that the specified channel is
not required.
Opts
?- assert( (arg_val(N,Val,Row) :- arg(N,Row,Val)) ).
?- mtx_data( mtcars, Mtcars ), mtx_rows_partition( Mtcars, arg_val(1,21.0), Incl, Excl, true ), length( Excl, Nxcl ), maplist( writeln, Incl ), write( xLen:Nxcl ), nl, fail.
row(mpg,cyl,disp,hp,drat,wt,qsec,vs,am,gear,carb)
row(21.0,6.0,160.0,110.0,3.9,2.62,16.46,0.0,1.0,4.0,4.0)
row(21.0,6.0,160.0,110.0,3.9,2.875,17.02,0.0,1.0,4.0,4.0)
xLen:31
Opts
?- mtx_data( mtcars, MtCars ), mtx_columns_sets( MtCars, Sets, true ), maplist( length, Sets, Lengths ), write( lengths(Lengths) ), nl.
lengths([25,3,27,22,22,29,30,2,2,3,6])
...
Requires pack(mlu).
Opts
?- [pack(mtx/examples/ones_plots)]. ones_plots. % displays 2 frequency plots one with a vertical separator line and % the other with 3 frequency groups distinguished by colour.
?- Mtx = [r(a,b,c,d),r(1,0,0,0),r(1,1,0,0),r(1,1,1,0)], maplist(writeln,Mtx), mtx_value_column_frequencies(Mtx,1,VC).
r(a,b,c,d)
r(1,0,0,0)
r(1,1,0,0)
r(1,1,1,0)
Mtx = [r(a, b, c, d), r(1, 0, 0, 0), r(1, 1, 0, 0), r(1, 1, 1, 0)],
VC = [a-3, b-2, c-1, d-0].
Opts
binary(Bin=true)
when true only record absence/presence, else record the number of occurrences
sort_rows(Sr=true)
sorts rows according to row names
sort_columns(Sc=true)
sorts columns according to column names
?- Mtx = [w(lets,nums),w(a,1),w(a,2),w(b,3),w(c,2),w(c,3)], mtx_columns_cross_table( Mtx, lets, nums, Tbl, true ), maplist( writeln, Mtx ), maplist( writeln, Tbl ).
w(lets,nums)
w(a,1)
w(a,2)
w(b,3)
w(c,2)
w(c,3)
hdr(,1,2,3)
row(a,1,1,0)
row(b,0,0,1)
row(c,0,1,1)
Mtx = [w(lets, nums), w(a, 1), w(a, 2), w(b, 3), w(c, 2), w(c, 3)],
Tbl = [hdr('', 1, 2, 3), row(a, 1, 1, 0), row(b, 0, 0, 1), row(c, 0, 1, 1)].
mtx_pos_elem/5 can be used to generate all positions and elements
Please note this uses the canonical representation and is not optimised for other formats.
Opts
?- Mtx = [row(a,b,c),row(1,2,3),row(4,5,6)], assert( a_mtx(Mtx) ).
?- a_mtx(Amtx), mtx_pos_elem(Amtx,I,J,Elem,true).
Amtx = [row(a, b, c), row(1, 2, 3), row(4, 5, 6)],
I = J, J = Elem, Elem = 1 ;
...
?- a_mtx(Amtx), mtx_pos_elem(Amtx,2,3,0,Bmtx,true).
Amtx = [row(a, b, c), row(1, 2, 3), row(4, 5, 6)],
Bmtx = [row(a, b, c), row(1, 2, 3), row(4, 5, 0)].
value(Val)
=DefV when you want to set the elements that fail ij_constraint
call(Gname,Scf,I,J,Elem|Gargs,NtxScf), else
it is call(Gname,Elem|Gargs,OutElem)
?- Mtx = [row(a,b,c),row(1,2,3),row(4,5,6),row(7,8,9)], assert( a_mtx(Mtx) ).
?- a_mtx( Amtx ), mtx_apply( Amtx, plus(1), Bmtx, true ).
Bmtx = [row(a, b, c), row(2, 3, 4), row(5, 6, 7), row(8, 9, 10)].
?- a_mtx( Amtx ), mtx_apply( Amtx, plus(1), Bmtx, ij_constraint(<) ).
Bmtx = [row(a, b, c), row(1, 3, 4), row(4, 5, 7), row(7, 8, 9)].
?- a_mtx( Amtx ), mtx_apply( Amtx, plus(1), Bmtx, [ij_constraint(=<),default_value(0),row_start(bottom)] ).
Bmtx = [row(a, b, c), row(0, 0, 4), row(0, 6, 7), row(8, 9, 10)].
?- a_mtx( Amtx ), mtx_apply( Amtx, plus(1), Bmtx, [ij_constraint(=<),default_value(0),row_start(top)] ).
Bmtx = [row(a, b, c), row(2, 3, 4), row(0, 6, 7), row(0, 0, 10)].
?- a_mtx( Amtx ), mtx_apply( Amtx, plus(1), Bmtx, [ij_constraint(<),default_value(0),row_start(top)] ).
Bmtx = [row(a, b, c), row(0, 3, 4), row(0, 0, 7), row(0, 0, 0)].
Types:
asserted (atomic) when Mtx is not a current handle and given that predicate Mtx/1 exists with its argument instantiating to a list, this list is taken to be a matrix in canonical representation
by_column (list of lists) which is assumed to be a per-column representation (see mtx_lists/2)
by_row (list of compounds) such as those read in with csv_read_file/2 but there is no restriction on term name and arity. This is the canonical representation and each term is a row of the matrix
predicated (Pid of the form Pname/Arity) where the atom Pname corresponds to a predicate name and the predicate with arity N is defined to succeed with the returned arguments
predfile (atomic) when Mtx is not a current mtx handle and given that predicate Mtx/1 exists with its argument instantiating to a non-list; this argument is taken to be the stem (with possible exts csv and tsv) or filename of a csv/tsv file which csv_read_file/3 can read as a canonical matrix
on_file (ground; non-list) (atomic or compound: csv file or its stem) as possible to be read by csv_read_file/2; alias paths are accepted and the normal delimited file extension can be omitted
handled (atomic) when mtx was cached at loading time (see option cache(Cache) in mtx/3)
If Mtx is a list, its contents are first checked for sublists (by_column) and then
for compounds (by_row). When Mtx is a predicate identifier of the form Pname/Arity,
it is taken to define the corresponding Mtx (predicated). If Mtx is atomic the options are
Mtx matrix handle exists (see mtx/2)
then the type is in_memory
Mtx/1 is defined and returns a list
type is asserted
Mtx/1 is defined and returns a non list
type on_file(File)
?- mtx_type( [[a],[b],[c]], Type ).
Type = by_column.
?- mtx_type( [r(a,b,c),r(1,2,3),r(4,5,6)], Type ).
Type = by_row.
?- mtx_type( pack(mtx/data/mtcars), Type ).
Type = on_file.
% was: Type = on_file('/usr/local/users/na11/local/git/lib/swipl-7.3.29/pack/mtx/data/mtcars.csv').
?- assert( mc_file(pack(mtx/data/mtcars)) ).
?- mtx_type( mc_file, Type ).
?- mtx( pack(mtx/data/mtcars), Mtx, cache(mtcars) ), assert(mc(Mtx)).
?- mtx_type( mtcars, Type ).
Type = handled.
?- mtx_type( mc, Type ).
Type = asserted.
?- mtx( mc, Mc ), findall( _, (member(Row,Mc),assert(Row)), _ ).
?- mtx( mc, [Hdr|_Rows] ), functor( Hdr, Pname, Arity ), mtx_type( Pname/Arity, Type ).
Hdr = ...,
Rows = ...,
Pname = row,
Arity = 11,
Type = predicated.
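The order of the checks for non-atomic inputs (sublists first, then compounds, then Pname/Arity) can be sketched as follows. The predicate sketch_mtx_type/2 is hypothetical and covers only these three cases; it deliberately leaves out handles, asserted matrices and files, which need the pack's own state.

```prolog
% sketch_mtx_type(+Mtx, -Type): classify lists and predicate identifiers.
% Only the first element of a list is inspected, which is an assumption
% of this sketch, not necessarily what the library does.
sketch_mtx_type(Mtx, Type) :-
    (   Mtx = [First|_],
        is_list(Mtx)
    ->  (   is_list(First)          % sublists are checked before compounds
        ->  Type = by_column
        ;   compound(First)
        ->  Type = by_row
        )
    ;   Mtx = Pname/Arity,
        atom(Pname),
        integer(Arity)
    ->  Type = predicated
    ).
```

Testing for sublists first matters because a non-empty list is itself a compound term, so the two checks cannot be swapped.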
OptS
match_arity(Match)
rows read in (see csv//2 options).
separator(Sep)
option of csv//2 (mtx_sep/2). Defaults to csv//2 version which is based on filename extension.
Any other OptS are passed to csv//2.
As per mtx/3 convention OptS can be a single option (un-listed) or a list of options.
?- tmp_file( testo, TmpF ), csv_write_file( TmpF, [row(c_a,c_b),row(1,a,b),row(2,aa,bb)], [match_arity(false),separator(0'\t)] ), mtx_read_table( TmpF, samples, Tbl, sep(tab) ).
TmpF = '/tmp/pl_testo_12445_0',
Tbl = [row(samples, c_a, c_b), row(1, a, b), row(2, aa, bb)].
?- assert( ( or_gate(List,And) :- sum_list(List,Sum), ( Sum > 0 -> And is 1; And is 0)) ).
?- Mtx = [r(a,b1,b2,c),r(0,1,0,1),r(0,0,1,0),r(1,0,0,1),r(1,1,1,0)], mtx_columns_collapse( Mtx, [b1,b2], b, or_gate, 2, OutMtx ).
Mtx = ...
OutMtx = [r(a, b, c), r(0, 0, 1), r(0, 0, 0), r(1, 0, 1), r(1, 1, 0)].
Opts
?- mtx( '../data/mtcars.csv', MtC ), mtx_row_apply( =, MtC, MtA, [] ).
MtC = MtA, MtA = [row(mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb), row(21.0, 6.0, 160.0, 110.0, 3.9, 2.62, 16.46, 0.0, 1.0, 4.0, 4.0), row(21.0, 6.0, 160.0, 110.0, 3.9, 2.875, 17.02, 0.0, 1.0, 4.0, 4.0), row(22.8, 4.0, 108.0, 93.0, 3.85, 2.32, 18.61, 1.0, 1.0, 4.0, 1.0), row(21.4, 6.0, 258.0, 110.0, 3.08, 3.215, 19.44, 1.0, 0.0, 3.0, 1.0), row(18.7, 8.0, 360.0, 175.0, 3.15, 3.44, 17.02, 0.0, 0.0, 3.0, 2.0), row(18.1, 6.0, 225.0, 105.0, 2.76, 3.46, 20.22, 1.0, 0.0, 3.0, 1.0), row(14.3, 8.0, 360.0, 245.0, 3.21, 3.57, 15.84, 0.0, 0.0, 3.0, 4.0), row(..., ..., ..., ..., ..., ..., ..., ..., ..., ..., ...)|...].
?- tmp_file( mtcars_clone, TmpF ), mtx_row_apply( =, '../data/mtcars.csv', TmpF, [] ).
Opts
?- mtx_bi_opts( [], true.csv, out.csv, Ins, Outs ).
min([])-sin([])-mou([])-sou([sep(44)])
Ins = [],
Outs = [sep(44)].
pack(mtx/data).
Data is in canonical Mtx format.
SetName
?- mtx( pack(mtx/data/mtcars), Mtcars ), mtx_data(mtcars, Mtcars).
Mtcars = [row(mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb), row(21.0, 6.0, 160.0, 110.0, 3.9, 2.62, 16.46, 0.0, 1.0, 4.0, 4.0), row(21.0, 6.0, 160.0, 110.0, 3.9, 2.875, 17.02, 0.0, 1.0, 4.0, 4.0), row(22.8, 4.0, 108.0, 93.0, 3.85, 2.32, 18.61, 1.0, 1.0, 4.0, 1.0), row(21.4, 6.0, 258.0, 110.0, 3.08, 3.215, 19.44, 1.0, 0.0, 3.0, 1.0), row(18.7, 8.0, 360.0, 175.0, 3.15, 3.44, 17.02, 0.0, 0.0, 3.0, 2.0), row(18.1, 6.0, 225.0, 105.0, 2.76, 3.46, 20.22, 1.0, 0.0, 3.0, 1.0), row(14.3, 8.0, 360.0, 245.0, 3.21, 3.57, 15.84, 0.0, 0.0, 3.0, 4.0), row(..., ..., ..., ..., ..., ..., ..., ..., ..., ..., ...)|...]
Sep can be a code, or one of:
mtx
.
The pack is distributed under the MIT license.
?- mtx_version( Ver, Date ). Ver = 0:1:0, Date = date(2018, 4, 2).
Previously:
Ver = 0:1:0, Date = date(2018, 4, 2).
The following predicates are exported, but not or incorrectly documented.