[API] Data Movement Support Enhancement (#171)
* [API] Enable building a function directly from IR (#133)

* add a pass for building a function directly from IR

* remove redundant print statement

* [API] Enable select API to accept Python expressions

* [API] Fixed incorrect casting for select in CastRemover

* [API][Backend] Streaming and OpenCL Backends (#138)

* add sdaccel, aocl for heterocl

* fpga

* Create codeanalys_openclc.cc

* Update target.py

* run

* can run successfully

* Create codegen_opencl.cc

* now

* all done

* Update codegen_sdaccel.cc

* Update codegen_sdaccel.cc

* 	modified:   python/heterocl/tvm/target.py

* 	new file:   samples/ppac/gemm/csrcPrint.py
	new file:   samples/ppac/gemm/data.py
	new file:   samples/ppac/gemm/gemm_ppac.py
	new file:   samples/ppac/gemm/headcode.txt
	new file:   samples/ppac/gemm/ppac_common.py
	new file:   tvm/src/codegen/build_ppac.cc
	new file:   tvm/src/codegen/codegen_rv64_ppac.cc
	new file:   tvm/src/codegen/codegen_rv64_ppac.h

* all

* remove tvm check code from kernel

* opencl-backend

* all

* fix ppac module build

* support ppac MVPb pragma

* fix ignoring ppac pragma in cpu backend

* opencl-backend

* aocl-backend

* move ppac codegen to ppac folder; fix argument name with merlinc analyser

* discard the new for-loop type; include ppac in hlib

* discard some previous changes

* Use int64_t as return type of GeMM on ppac

* [add] codegenc kerneldef + stream init

* [add] var_shape_map

* [update] kerneldef struct shape

* [update] use noderef and restore

* [fix] return op

* [add] hcl device & kernelstmt printer

* [fix] def workaround

* [update] stream example

* [add] stream expr & stmt ir

* [fix] kernel arg location for stream

* opt1

* opencl-general

* new-version

* no bug

* a

* test+unroll+pipeline

* pragma

* new

* type has fixed

* new_test

* test_reorder_split_fuse

* target

* order

* simplified by rui

* analysis

* bug fixed

* [delete] all of the code about opencl

* [ADD] new opencl back-end including xilinx & intel

* fixed __local

* fixed data_type for xilinx opencl

* add makefile for SDAccel_runtime

* add the runtime for sdaccel

* create the sdaccel host

* fixed the indent problem partly

* test the zhang-05 server

* add indent to the host.cpp

* automatically generate makefile

* delete common folder from opencl

* add shmat to sdaccel runtime

* fixed bug for sdaccel runtime seg fault

* fixed the bug of host.cpp multiple

* fixed host.cpp multiple bug

* fixed endif for makefile

* modify sdaccel_sw_emu -> sdaccel_csim

* fix the __local and __global for intel opencl back-end

* Fix the arbitrary integer precision for aocl

* [add] ir visitor & functor for codegen

* [add] aocl stream codegen

* [add] aocl stream support

* [fix] aocl type conversion

* [fix] aocl channel syntax

* [add] sch.stream_to

* [fix] add stream annotation

* [add] host device codegen

* [add] stream ir mutator

* [Add] Interface pragma for SDx sim

* [add] host xcel codegen

* [update] build interface

* [update] new build interface

* [fix] temp update

* [update] stream example

* [add] rocc-ppac sim

* [rm] submodule

* [update] rocc ppac hlib

* [add] unified sim & kernel updater

* re-organize build common util

* [update] stream in codegen c

* [update] codegen construct for streaming

* [update] code post-processing

* [fix] test cases

* [fix] python compatibility

* [update] future

* [fix] metaclass

* [fix] test import issue

* Revert "[API][Backend] Streaming and OpenCL Backends (#138)" (#139)

This reverts commit 2c75344.

* [API] Remove support for Python 2 (#143)

* remove support for Python 2

* switch from Python 3.7 to 3.6

* [Backend] Fix LLVM CodeGen for intrinsics (#147)

* Fix llvm codegen for intrinsic log, pow, and sqrt

* fix test case

* [Backend] Fix LLVM power intrinsic with large integer (#151)

* fix llvm power with large integer

* fix test case

* [API] Fix Wrong Index Calculation in API reuse_at (#156)

* test cases

* rename

* systolic

* self loopback

* fix test

* fork and join

* fix host only

* sobel

* [API] Adding printing function to HeteroCL (#178)

* initial attempt for hcl.print

* enable better printing

* finish a version of hcl.print

* add tests

* [Backend] Fixed Incorrect Behavior When Casting Constants to Very Long Int (#179)

* fixed incorrect behavior in Halide

* add test

* map reduce example

* reshape check

* [backend] Add LLVM 9.0 support (#182)

* [API][Backend] Fix hcl.print with UInt supported (#184)

* memory

* clean up stream type

* codegen update

* hbm support

* host codegen update

* fix auto-merge issue

* fix extern ip

* fix hls ip

* update

* if

* join api

* comment

Co-authored-by: Yi-Hsiang (Sean) Lai <[email protected]>
Co-authored-by: HZ Chen <[email protected]>
3 people authored May 3, 2020
1 parent 220c7a6 commit e322a08
Showing 66 changed files with 2,554 additions and 614 deletions.
4 changes: 3 additions & 1 deletion Makefile.config
@@ -3,7 +3,9 @@
LLVM_CONFIG = $(shell which llvm-config-4.0 2>/dev/null || \
which llvm-config-5.0 2>/dev/null || \
which llvm-config-6.0 2>/dev/null || \
which llvm-config-7.0 2>/dev/null)
which llvm-config-7.0 2>/dev/null || \
which llvm-config-8.0 2>/dev/null || \
which llvm-config-9.0 2>/dev/null)

# set your own path to cmake
CMAKE_CONFIG = $(shell which cmake 2> /dev/null)
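
The fallback chain above takes the first versioned `llvm-config` binary found on the PATH, and this commit extends it to 8.0 and 9.0. A minimal Python sketch of the same lookup order follows. The helper name `find_llvm_config` and the injectable `which` argument are hypothetical, added only for illustration; they are not part of the repository.

```python
import shutil

# Candidate binaries, in the same order as the Makefile.config fallback chain.
CANDIDATES = ["llvm-config-%d.0" % v for v in (4, 5, 6, 7, 8, 9)]


def find_llvm_config(which=shutil.which):
    """Return the path of the first candidate found on PATH, or None.

    `which` is injectable (an assumption made for testing); by default it is
    the standard-library shutil.which.
    """
    for name in CANDIDATES:
        path = which(name)
        if path:
            return path
    return None
```

As in the shell version, the lowest installed version wins because the candidates are tried in ascending order.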
12 changes: 6 additions & 6 deletions hlib/python/hlib/ip/fft.py
@@ -41,20 +41,20 @@ def single_fft_hls(X_real, X_imag, F_real=None, F_imag=None, name=None):
hcl.update(F_imag, lambda i: X_imag[Table[i]], name='F_imag_update')

with hcl.Stage("Out"):
one = hcl.scalar(1, dtype="int32")
one = hcl.scalar(1, dtype="int32", name="one")
with hcl.for_(0, num_stages) as stage:
DFTpts = one[0] << (stage + 1)
numBF = DFTpts / 2
e = -2 * np.pi / DFTpts
a = hcl.scalar(0)
a = hcl.scalar(0, "a")
with hcl.for_(0, numBF) as j:
c = hcl.scalar(hcl.cos(a[0]))
s = hcl.scalar(hcl.sin(a[0]))
c = hcl.scalar(hcl.cos(a[0]), name="cos")
s = hcl.scalar(hcl.sin(a[0]), name="sin")
a[0] = a[0] + e
with hcl.for_(j, L + DFTpts - 1, DFTpts) as i:
i_lower = i + numBF
temp_r = hcl.scalar(F_real[i_lower] * c - F_imag[i_lower] * s)
temp_i = hcl.scalar(F_imag[i_lower] * c + F_real[i_lower] * s)
temp_r = hcl.scalar(F_real[i_lower] * c - F_imag[i_lower] * s, "temp_r")
temp_i = hcl.scalar(F_imag[i_lower] * c + F_real[i_lower] * s, "temp_i")
F_real[i_lower] = F_real[i] - temp_r[0]
F_imag[i_lower] = F_imag[i] - temp_i[0]
F_real[i] = F_real[i] + temp_r[0]
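
The hunk above only adds names to the scalars, but the loop nest it touches is an iterative radix-2 butterfly. A plain-Python reference of that loop nest may make the renamed variables (`a`, `cos`, `sin`, `temp_r`, `temp_i`) easier to follow. This is a sketch under stated assumptions, not the hlib implementation: the input permutation `Table` is assumed to be a bit reversal, the final `F_imag[i]` update is assumed because the hunk is cut off before it, and the inner loop bound is simplified to `L`.

```python
import math


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i (assumed meaning of `Table`)."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def fft_reference(x_real, x_imag):
    """Iterative radix-2 FFT mirroring the loop structure in the hunk."""
    L = len(x_real)
    num_stages = L.bit_length() - 1
    assert 1 << num_stages == L, "length must be a power of two"
    f_real = [x_real[bit_reverse(i, num_stages)] for i in range(L)]
    f_imag = [x_imag[bit_reverse(i, num_stages)] for i in range(L)]
    for stage in range(num_stages):
        dft_pts = 1 << (stage + 1)
        num_bf = dft_pts // 2
        e = -2 * math.pi / dft_pts
        a = 0.0  # twiddle angle, the scalar named "a" in the diff
        for j in range(num_bf):
            c, s = math.cos(a), math.sin(a)  # named "cos" and "sin"
            a += e
            for i in range(j, L, dft_pts):  # simplified bound (assumption)
                i_lower = i + num_bf
                temp_r = f_real[i_lower] * c - f_imag[i_lower] * s
                temp_i = f_imag[i_lower] * c + f_real[i_lower] * s
                f_real[i_lower] = f_real[i] - temp_r
                f_imag[i_lower] = f_imag[i] - temp_i
                f_real[i] += temp_r
                f_imag[i] += temp_i  # assumed; not visible in the hunk
    return f_real, f_imag
```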
21 changes: 9 additions & 12 deletions hlib/python/hlib/op/extern.py
@@ -82,20 +82,17 @@ def with_attrs(f):


# create hls ip invoked within the top function
def create_hls_ip(op, name, args, ip_type="hls", path=None):
def create_hls_ip(stage, name, args, ip_type="hls", path=None):
# must be called within a superstage
assert Stage._current
curr = Schedule.last_stages[-1]
input_ops = [i._op for i in curr.input_stages]
output_bufs = [curr._buf]
input_ops = [i._op for i in stage.input_stages]
output_bufs = [stage._buf]


# include external ip files
def create_extern_module(op, dicts, ip_type="hls", path=None):
curr = Schedule.last_stages[-1]
input_ops = [i._op for i in curr.input_stages]
input_bufs = [i._buf for i in curr.input_stages]
output_bufs = [curr._buf]
def create_extern_module(stage, dicts, ip_type="hls", path=None):
input_ops = [i._op for i in stage.input_stages]
input_bufs = [i._buf for i in stage.input_stages]
output_bufs = [stage._buf]

# input and output arguments
assert "args" in dicts.keys()
@@ -104,7 +101,7 @@ def create_extern_module(op, dicts, ip_type="hls", path=None):
annotate_dict["input::" + name] = dtype
del annotate_dict["args"]

op = op._op.op
op = stage._op.op
assert ip_type in ["rtl", "hls", "host"]
body = _make.ExternModule(
"top", _make.StringImm(ip_type), op.body,
@@ -113,5 +110,5 @@ def create_extern_module(op, dicts, ip_type="hls", path=None):
new_op = _ExternOp(
op.name, op.tag, op.axis,
input_ops, input_bufs, output_bufs, body)
curr._op = new_op.output(0)
stage._op = new_op.output(0)
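
The change in this file replaces the implicit lookup `Schedule.last_stages[-1]` with a `stage` argument supplied by the caller. A small sketch of that explicit style follows, using a hypothetical `FakeStage` stand-in that carries only the attributes read in the hunk (`input_stages`, `_op`, `_buf`). Neither `FakeStage` nor `collect_io` exists in HeteroCL.

```python
class FakeStage:
    """Hypothetical stand-in for a HeteroCL stage; only the fields used below."""

    def __init__(self, name, input_stages=()):
        self.name = name
        self.input_stages = list(input_stages)
        self._op = "op:" + name    # placeholder for the stage's operation
        self._buf = "buf:" + name  # placeholder for the stage's output buffer


def collect_io(stage):
    """Gather inputs and outputs from an explicitly passed stage.

    No global schedule state is consulted, so the result depends only on
    the argument.
    """
    input_ops = [s._op for s in stage.input_stages]
    input_bufs = [s._buf for s in stage.input_stages]
    output_bufs = [stage._buf]
    return input_ops, input_bufs, output_bufs
```

Passing the stage explicitly means the helper no longer depends on which stage happened to be created last.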

2 changes: 1 addition & 1 deletion pkgs/Makefile.pkg.config
@@ -21,7 +21,7 @@ else
CMAKE_OK = yes
endif

ifeq ("", "$(findstring $(LLVM_VERSION), 40 50 60 70)")
ifeq ("", "$(findstring $(LLVM_VERSION), 40 50 60 70 80 90)")
DIRS += llvm
endif
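
The condition above adds the bundled `llvm` directory to `DIRS` only when `$(findstring ...)` comes back empty, that is, when the detected version is not one of the listed ones. A Python sketch of that check follows. The function name `needs_bundled_llvm` is hypothetical, and how `LLVM_VERSION` is computed is not shown in this hunk, so the two-digit string form is an assumption.

```python
SUPPORTED = "40 50 60 70 80 90"


def needs_bundled_llvm(llvm_version, supported=SUPPORTED):
    """Mimic `ifeq ("", "$(findstring $(LLVM_VERSION), ...)")`.

    GNU Make's findstring returns the needle when it occurs as a substring
    of the text and the empty string otherwise. An empty needle therefore
    yields "", so a missing llvm-config also triggers the bundled build.
    """
    found = llvm_version if llvm_version and llvm_version in supported else ""
    return found == ""
```

Because `findstring` is a substring match rather than a word match, the check relies on version strings being well formed.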

101 changes: 94 additions & 7 deletions python/heterocl/api.py
@@ -1,13 +1,14 @@
"""This module contains all HeteroCL APIs"""
#pylint: disable=no-member
import numbers
from ordered_set import OrderedSet
from .tvm.build_module import build as _build, lower as _lower
from .tvm.api import convert
from .tvm.api import convert, _IterVar
from .tvm import _api_internal as tvm_api
from .tvm import schedule as _schedule
from .tvm import make as _make
from .tvm import call_intrin
from .tensor import Scalar, Tensor
from .tvm import expr as _expr, stmt as _stmt, make as _make
from .tensor import Scalar, Tensor, TensorSlice
from .schedule import Stage, Schedule
from .scheme import Scheme
from . import util
@@ -142,8 +143,9 @@ def algo(A):
"""
if not isinstance(inputs, list):
inputs = [inputs]
func(*inputs)
for op in Schedule.stage_ops:
with Stage("_top") as top:
func(*inputs)
for op in top.substages:
func.__setattr__(op.name, op)
return Scheme(inputs, func)

@@ -201,15 +203,17 @@ def algo(A):
Schedule.stage_ops = []
Schedule.last_stages = OrderedSet([])
# execute the algorithm
ret = func(*inputs)
with Stage("_top") as top:
ret = func(*inputs)
# append the output tensors to the input list
if ret is not None:
if isinstance(ret, tuple):
inputs += list(ret)
else:
inputs.append(ret)
# let each stage be an attribute of the function
for op in Schedule.stage_ops:
for op in top.substages:
#op = stage._op
func.__setattr__(op.name, op)
t = Schedule.last_stages
ops = [t_._op.op for t_ in t]
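
Both hunks above wrap the user function in `with Stage("_top") as top:` and then iterate `top.substages` instead of the global `Schedule.stage_ops`. One way such a context manager can collect the stages opened inside it is sketched below. `MiniStage` is hypothetical and much simpler than HeteroCL's `Stage`; in particular, registering a stage with its parent on exit is an assumption, not something this diff shows.

```python
class MiniStage:
    """Hypothetical, simplified stage that records stages opened inside it."""

    _stack = []  # currently open stages, innermost last

    def __init__(self, name):
        self.name = name
        self.substages = []

    def __enter__(self):
        MiniStage._stack.append(self)
        return self

    def __exit__(self, exc_type, exc, tb):
        MiniStage._stack.pop()
        # Register with the enclosing stage, if there is one (assumption).
        if MiniStage._stack:
            MiniStage._stack[-1].substages.append(self)
        return False  # do not swallow exceptions


def algo():
    with MiniStage("B"):
        pass
    with MiniStage("C"):
        pass


# Mirror the pattern in the diff: run the algorithm under a "_top" stage,
# then expose each collected substage as an attribute of the function.
with MiniStage("_top") as top:
    algo()

for stage in top.substages:
    setattr(algo, stage.name, stage)
```

After this runs, `algo.B` and `algo.C` refer to the two stages, and nothing is read from global lists.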
@@ -360,3 +364,86 @@ def select(cond, true, false):
Expr
"""
return _make.Select(convert(cond), convert(true), convert(false))

def print(vals, format=""):
"""Print a HeteroCL object.
Parameters
----------
vals : Expr or list of Expr
The values to be printed
format : string, optional
The printing format similar to printf
Returns
-------
None
"""
if not isinstance(vals, (tuple, list)):
vals = [vals]

def get_format(val):
if isinstance(val, (TensorSlice, Scalar, _expr.Expr)):
if (util.get_type(val.dtype)[0] == "int"
or util.get_type(val.dtype)[0] == "uint"):
return "%lld"
else:
return "%f"
elif isinstance(val, int):
return "%d"
elif isinstance(val, float):
return "%f"

def print_tensor(val, ivs, i, ndim):
if i == 0: #inner-most
iv = ivs[ndim-1]
stmt = _make.Print([], "[")
value = val[tuple(ivs)]
body = _make.Print([value], get_format(value))
ite = _make.IfThenElse(iv < iv.dom.extent-1,
_make.Print([], ", "),
_make.Evaluate(0))
body = _make.Block(body, ite)
loop = _make.For(iv.var, iv.dom.min, iv.dom.extent, 0, 0, body)
stmt = _make.Block(stmt, loop)
stmt = _make.Block(stmt, _make.Print([], "]"))
return stmt
else:
iv = ivs[ndim-1-i]
stmt = _make.Print([], "[")
body = print_tensor(val, ivs, i-1, ndim)
ite = _make.IfThenElse(iv < iv.dom.extent-1,
_make.Print([], ",\n"),
_make.Evaluate(0))
body = _make.Block(body, ite)
loop = _make.For(iv.var, iv.dom.min, iv.dom.extent, 0, 0, body)
stmt = _make.Block(stmt, loop)
stmt = _make.Block(stmt, _make.Print([], "]"))
return stmt

def print_val(val):
stage = Stage.get_current()
if isinstance(val, (Scalar, _expr.Expr, numbers.Number)):
stage.emit(_make.Print([val], get_format(val) + "\n"))
elif isinstance(val, TensorSlice) \
and len(val.indices) == len(val.tensor.shape):
stage.emit(_make.Print([val], get_format(val) + "\n"))
else: # we are dealing with tensors
nshape = len(val.tensor.shape)
ndim = nshape
if isinstance(val, TensorSlice):
ndim = nshape - len(val.indices)
args = ["print_"+str(n) for n in range(0, ndim)]
ivs = [_IterVar((0, val.tensor.shape[nshape-n-1]), args[n], 0) \
for n in range(0, ndim)]
import builtins
stage.emit(print_tensor(val, ivs, ndim-1, ndim))
stage.emit(_make.Print([], "\n"))

if format == "":
for val in vals:
print_val(val)
else:
stage = Stage.get_current()
stage.emit(_make.Print(vals, format))
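
When no format string is given, the new `print` chooses a printf-style format per value and, for tensors, emits nested loops that bracket each dimension, separating innermost elements with `", "` and outer rows with `",\n"`. The text layout this produces can be sketched in plain Python on nested lists. The helpers `fmt` and `render` are hypothetical and work on ordinary Python values rather than HeteroCL IR; integers use `%d` here, whereas the diff uses `%lld` for typed integer expressions. Non-empty lists are assumed.

```python
def fmt(value):
    """Pick a printf-style format by type: integers as %d, the rest as %f."""
    if isinstance(value, int):
        return "%d" % value
    return "%f" % value


def render(tensor):
    """Render a non-empty nested list with the same bracket layout.

    Innermost dimension: elements joined by ", ".
    Outer dimensions: sub-tensors joined by ",\n".
    """
    if not isinstance(tensor[0], list):  # innermost dimension
        return "[" + ", ".join(fmt(v) for v in tensor) + "]"
    return "[" + ",\n".join(render(sub) for sub in tensor) + "]"
```

For example, `render([[1, 2], [3, 4]])` yields the two rows on separate lines inside one outer pair of brackets.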
1 change: 1 addition & 0 deletions python/heterocl/compute_api.py
@@ -670,6 +670,7 @@ def assign_val(*indices):

return compute(tuple(new_shape), assign_val, name, dtype)


def reduce_axis(lower, upper, name=None):
"""Create a reduction axis for reduction operations.